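✨ Introduction

Ragas is a library that provides tools to supercharge the evaluation of Large Language Model (LLM) applications. It is designed to help you evaluate your LLM applications with ease and confidence.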

  • 🚀 Get Started

    Install with pip and get started with Ragas using these tutorials.

    Get Started

  • 📚 Core Concepts

    In-depth explanations and discussion of the concepts and inner workings of the different features available in Ragas.

    Core Concepts

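  • 🛠️ How-to Guides

    Practical guides to help you achieve specific goals. Take a look at these guides to learn how to use Ragas to solve real-world problems.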

    How-to Guides

  • 📖 References

    Technical descriptions of how Ragas classes and methods work.

    References

Frequently Asked Questions

What is the best open-source model to use?
There isn't a single correct answer to this question. With the rapid pace of AI model development, new open-source models are released every week, often claiming to outperform previous versions. The best model for your needs depends largely on your GPU capacity and the type of data you're working with.

It's a good idea to explore newer, widely accepted models with strong general capabilities. You can refer to this list for available open-source models, their release dates, and fine-tuned variants.
Why do NaN values appear in evaluation results?
NaN stands for "Not a Number." In ragas evaluation results, NaN can appear for two main reasons:
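  • JSON parsing issue: The model's output is not JSON-parsable. ragas requires models to output JSON-compatible responses because all prompts are structured using Pydantic. This ensures efficient parsing of LLM outputs.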
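
To make the JSON parsing case concrete, here is a minimal sketch of how an unparsable model response can end up as NaN in a results column. It is not ragas's actual implementation: `parse_score`, `nan_indices`, and the `"score"` key are hypothetical names chosen for illustration, and the standard-library `json` module stands in for the Pydantic validation mentioned above.

```python
import json
import math


def parse_score(raw_output: str, key: str = "score") -> float:
    """Return the numeric value stored under `key`, or NaN if parsing fails.

    Hypothetical helper for illustration only; not part of ragas.
    """
    try:
        payload = json.loads(raw_output)
        return float(payload[key])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Output was not valid JSON, or lacked a usable numeric field.
        return math.nan


def nan_indices(scores: list[float]) -> list[int]:
    """Return the positions of samples whose score is NaN."""
    return [i for i, s in enumerate(scores) if math.isnan(s)]


# Made-up model responses: one valid, one free text, one missing the key.
raw_outputs = [
    '{"score": 1}',
    "Sure! The score is 1.",
    '{"verdict": "yes"}',
]

scores = [parse_score(output) for output in raw_outputs]
failed = nan_indices(scores)
print(scores, failed)
```

Listing the NaN positions this way can help you pull up the raw responses for those samples and check whether the model is wrapping its answer in extra text or omitting a required field.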
How can I make evaluation results more explainable?
The best way is to trace and log your evaluation, then inspect the results using LLM traces. You can follow a detailed example of this process here.