✨ Introduction

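Ragas is a library that provides tools to strengthen the evaluation of Large Language Model (LLM) applications. It is designed to help you evaluate your LLM applications with ease and confidence.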

  • 🚀 Get Started

    Install with pip and get started with Ragas using these tutorials.

    Get Started

  • 📚 Core Concepts

    In-depth explanations and discussion of the concepts behind the different features in Ragas and how they work.

    Core Concepts

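  • 🛠️ How-to Guides

    Practical guides to help you achieve specific goals. Take a look at these guides to learn how to use Ragas to solve real-world problems.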

    How-to Guides

  • 📖 References

    Technical descriptions of how Ragas classes and methods work.

    References

Frequently Asked Questions

What is the best open-source model to use?
There isn't a single correct answer to this question. AI models are developing rapidly, and new open-source models are released every week, often claiming to outperform earlier ones. The best model for your needs depends largely on your GPU capacity and the type of data you're working with. It's a good idea to explore newer, widely adopted models with strong general capabilities. You can refer to this list for available open-source models, their release dates, and fine-tuned variants.
Why do NaN values appear in evaluation results?
NaN stands for "Not a Number." In ragas evaluation results, NaN can appear for two main reasons:
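  • JSON parsing issue: The model's output cannot be parsed as JSON. ragas requires models to output JSON-compatible responses because all prompts are structured using Pydantic. This allows LLM outputs to be parsed efficiently.
  • Non-ideal cases for scoring: Some cases in the sample may not be suitable for scoring. For example, scoring the faithfulness of a response like "I don't know" may not be appropriate.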
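The first cause can be illustrated with a minimal sketch using only the Python standard library. It is not ragas's internal parsing code: the `parse_score` helper and the `"score"` field name are hypothetical and only show how an unparsable reply might turn into NaN, and how NaN rows could be located and excluded before averaging.

```python
import json
import math


def parse_score(raw_output: str) -> float:
    """Return the 'score' field of a JSON reply, or NaN if it cannot be read.

    Hypothetical helper: the 'score' key is an assumption for illustration.
    """
    try:
        payload = json.loads(raw_output)
        return float(payload["score"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Reply was not valid JSON, or did not contain a numeric score.
        return math.nan


# Simulated model replies; the second one is plain prose, not JSON.
raw_outputs = [
    '{"score": 0.8}',
    "Sure! The score is 0.8",
    '{"score": 1.0}',
]

scores = [parse_score(raw) for raw in raw_outputs]

# Indices of samples whose score could not be produced.
nan_rows = [i for i, s in enumerate(scores) if math.isnan(s)]

# Average over the valid scores only, so NaN does not poison the mean.
valid = [s for s in scores if not math.isnan(s)]
mean_valid = sum(valid) / len(valid) if valid else math.nan
```

Inspecting the raw replies at the indices in `nan_rows` is usually the quickest way to tell whether a NaN came from malformed output or from a sample that should not have been scored.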
How can I make evaluation results more explainable?
The best way is to trace and log your evaluation, then inspect the results using LLM traces. You can follow a detailed example of this process here.
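As a rough sketch of what tracing and logging an evaluation can look like, the snippet below records the prompt, the raw model reply, and the resulting score for each sample. It uses only the standard library as a stand-in for a dedicated tracing tool. Every name here (`TraceRecord`, `traced_score`, `fake_llm`, `parse`) is hypothetical and not part of the ragas API.

```python
import json
import logging
from dataclasses import asdict, dataclass

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("eval_trace")


@dataclass
class TraceRecord:
    """One evaluated sample: what was asked, what came back, and the score."""
    sample_id: int
    prompt: str
    raw_output: str
    score: float


def traced_score(sample_id, prompt, call_llm, parse):
    """Call the model, parse its reply, and log the full record as JSON."""
    raw = call_llm(prompt)
    score = parse(raw)
    record = TraceRecord(sample_id, prompt, raw, score)
    logger.info(json.dumps(asdict(record)))
    return record


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call (assumption for illustration).
    return '{"score": 0.5}'


def parse(raw: str) -> float:
    # Assumes the reply is JSON with a numeric "score" field.
    return float(json.loads(raw)["score"])


prompts = ["Is the answer grounded in the retrieved context?"]
traces = [traced_score(i, p, fake_llm, parse) for i, p in enumerate(prompts)]
```

Keeping the raw reply next to the score lets you see why a sample scored the way it did, rather than only seeing the final number.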