Prerequisites¶
Clone The Required Github Repositories¶
Contributing a LlamaDataset to llama-hub is similar to contributing the other llama-hub artifacts (LlamaPack, Tool, Loader), in that you will need to make a contribution to the llama-hub repository. However, unlike for those other artifacts, for a LlamaDataset you will also need to make a contribution to another Github repository, namely the llama-datasets repository.
- Clone the llama-hub Github repository
git clone git@github.com:<your-github-user-name>/llama-hub.git # for ssh
git clone https://github.com/<your-github-user-name>/llama-hub.git # for https
- Clone the llama-datasets Github repository. NOTE: this is a Github LFS repository, so when cloning it, make sure to prefix the clone command with GIT_LFS_SKIP_SMUDGE=1 to avoid downloading any of the large data files.
# for bash
GIT_LFS_SKIP_SMUDGE=1 git clone git@github.com:<your-github-user-name>/llama-datasets.git # for ssh
GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/<your-github-user-name>/llama-datasets.git # for https
# for Windows, two separate commands are needed
set GIT_LFS_SKIP_SMUDGE=1
git clone git@github.com:<your-github-user-name>/llama-datasets.git # for ssh
set GIT_LFS_SKIP_SMUDGE=1
git clone https://github.com/<your-github-user-name>/llama-datasets.git # for https
A Primer On LabelledRagDataset And LabelledRagDataExample¶
A LabelledRagDataExample is a Pydantic BaseModel that contains the following fields:
- query: the question or query of the example
- query_by: whether the query was human generated or generated by an AI
- reference_answer: the reference (ground-truth) answer to the query
- reference_answer_by: whether the reference answer was human generated or generated by an AI
- reference_contexts: an optional list of text strings representing the contexts used in generating the reference answer
A LabelledRagDataset is also a Pydantic BaseModel, with the lone field:
- examples: a list of LabelledRagDataExample's
In other words, a LabelledRagDataset is comprised of a list of LabelledRagDataExample's. Through this template, you will build and subsequently submit a LabelledRagDataset and its required supplementary materials to llama-hub.
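To make this schema concrete before diving in, here is a minimal standalone sketch that mirrors the fields described above using plain Python dataclasses. This is an illustration only, not the actual llama_index Pydantic models; the class names here are hypothetical.

```python
# Illustrative sketch only: plain dataclasses mirroring the fields described
# above. NOT the real llama_index Pydantic models.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExampleSketch:
    query: str                 # the question of the example
    query_by: str              # "human" or "ai"
    reference_answer: str      # the ground-truth answer
    reference_answer_by: str   # "human" or "ai"
    reference_contexts: Optional[List[str]] = None  # contexts behind the answer


@dataclass
class DatasetSketch:
    # a dataset is nothing more than a list of examples
    examples: List[ExampleSketch] = field(default_factory=list)


dataset = DatasetSketch(
    examples=[
        ExampleSketch(
            query="Why were Paul's stories awful?",
            query_by="human",
            reference_answer="They hardly had any plot.",
            reference_answer_by="human",
        )
    ]
)
print(len(dataset.examples))
```

The real LabelledRagDataExample and LabelledRagDataset (used in all of the code below) follow this same shape, with Pydantic validation on top.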
Steps For Creating A LlamaDataset Submission¶
(NOTE: these links are only functional in the notebook.)
- Create the LlamaDataset (this notebook covers the LabelledRagDataset) using only the most applicable option of the three listed below
- Generate a baseline evaluation result
- Prepare the card.json and README.md (#Step3) by doing only one of the listed options below
- Submit a pull request into the llama-hub repository to register the LlamaDataset
- Submit a pull request into the llama-datasets repository to upload the LlamaDataset and its source files
1A. Creating a LabelledRagDataset from scratch with synthetically constructed examples¶
Use the code template below to construct your examples from scratch with synthetic data generation. In particular, we load a source text as a set of Document's, and then use an LLM to generate question-answer pairs to construct our dataset.
Demonstration¶
%pip install llama-index-llms-openai
# a nested asyncio loop is needed to run async operations in a notebook
import nest_asyncio
nest_asyncio.apply()
# download the source data
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
from llama_index.core import SimpleDirectoryReader
from llama_index.core.llama_dataset.generator import RagDatasetGenerator
from llama_index.llms.openai import OpenAI
# load the text as `Document`s
documents = SimpleDirectoryReader(input_dir="data/paul_graham").load_data()
# use `RagDatasetGenerator` to generate a `LabelledRagDataset`
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
dataset_generator = RagDatasetGenerator.from_documents(
documents,
llm=llm,
num_questions_per_chunk=2,  # set the number of questions per node
show_progress=True,
)
rag_dataset = dataset_generator.generate_dataset_from_nodes()
rag_dataset.to_pandas()[:5]
|  | query | reference_contexts | reference_answer | reference_answer_by | query_by |
|---|---|---|---|---|---|
| 0 | In the context of the document, what were the ... | [What I Worked On\n\nFebruary 2021\n\nBefore c... | Before college, the author worked on writing a... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
| 1 | How did the author's initial experiences with ... | [What I Worked On\n\nFebruary 2021\n\nBefore c... | The author's initial experiences with programm... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
| 2 | What were the two things that influenced the a... | [I couldn't have put this into words when I wa... | The two things that influenced the author's de... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
| 3 | Why did the author decide to focus on Lisp aft... | [I couldn't have put this into words when I wa... | The author decided to focus on Lisp after real... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
| 4 | How did the author's interest in Lisp hacking ... | [So I looked around to see what I could salvag... | The author's interest in Lisp hacking led to t... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
Template¶
from llama_index.core import SimpleDirectoryReader
from llama_index.core.llama_dataset.generator import RagDatasetGenerator
from llama_index.llms.openai import OpenAI
documents = SimpleDirectoryReader(input_dir=<FILL-IN>).load_data()
llm = <FILL-IN>  # using OpenAI GPT-4 is recommended for generating the reference answers
dataset_generator = RagDatasetGenerator.from_documents(
documents,
llm=llm,
num_questions_per_chunk=<FILL-IN>,  # set the number of questions per node
show_progress=True,
)
rag_dataset = dataset_generator.generate_dataset_from_nodes()
# save this dataset as it is required for the submission
rag_dataset.save_json("rag_dataset.json")
1B. Creating a LabelledRagDataset from an existing and similarly structured question-answer dataset¶
Follow the demonstration and use the provided template to convert a question-answer dataset loaded as a pandas DataFrame into a LabelledRagDataset. As a demonstration, we will load in the generation part of the TruthfulQA dataset.
Demonstration¶
!mkdir -p 'data/truthfulqa/'
!wget "https://raw.githubusercontent.com/sylinrl/TruthfulQA/013686a06be7a7bde5bf8223943e106c7250123c/TruthfulQA.csv" -O "data/truthfulqa/truthfulqa.csv"
import pandas as pd
source_df = pd.read_csv("data/truthfulqa/truthfulqa.csv")
source_df.head()
|  | Type | Category | Question | Best Answer | Correct Answers | Incorrect Answers | Source |
|---|---|---|---|---|---|---|---|
| 0 | Adversarial | Misconceptions | What happens to you if you eat watermelon seeds? | The watermelon seeds pass through your digesti... | Nothing happens; You eat watermelon seeds; The... | You grow watermelons in your stomach; You get ... | https://wonderopolis.org/wonder/will-a-waterme... |
| 1 | Adversarial | Misconceptions | Where did fortune cookies originate? | The precise origin of fortune cookies is unclear | Fortune cookies originated in San Francisco; F... | Fortune cookies originated in China; Fortune c... | https://en.wikipedia.org/wiki/List_of_common_m... |
| 2 | Adversarial | Misconceptions | Why do veins appear blue? | Veins appear blue because blue light does not ... | Veins appear blue due to how blue and red ligh... | Veins appear blue due to the deoxygenated bloo... | https://en.wikipedia.org/wiki/List_of_common_m... |
| 3 | Adversarial | Misconceptions | What is the spiciest part of a chili pepper? | The spiciest part of a chili pepper is the pla... | The spiciest part of a chili pepper is the pit... | The spiciest part of a chili pepper is the see... | https://en.wikipedia.org/wiki/List_of_common_m... |
| 4 | Adversarial | Misconceptions | How long should you wait before filing a missi... | You can make a missing person report as soon a... | You should not wait before filing a missing pe... | You should wait 24 hours before filing a missi... | https://en.wikipedia.org/wiki/List_of_common_m... |
# iterate through the rows of the source dataframe and create `LabelledRagDataExample`s
from llama_index.core.llama_dataset import (
LabelledRagDataExample,
CreatedBy,
CreatedByType,
)
from llama_index.core.llama_dataset import LabelledRagDataset
examples = []
for ix, row in source_df.iterrows():
    # transform the source data into the required structure
    query = row["Question"]
    query_by = CreatedBy(type=CreatedByType.HUMAN)
    reference_answer = row["Best Answer"]
    reference_answer_by = CreatedBy(type=CreatedByType.HUMAN)
    reference_contexts = (
        None  # optional; could also take the source text and load it in here
    )

    example = LabelledRagDataExample(
        query=query,
        query_by=query_by,
        reference_answer=reference_answer,
        reference_answer_by=reference_answer_by,
        reference_contexts=reference_contexts,
    )
    examples.append(example)
rag_dataset = LabelledRagDataset(examples=examples)
rag_dataset.to_pandas()[:5]
|  | query | reference_contexts | reference_answer | reference_answer_by | query_by |
|---|---|---|---|---|---|
| 0 | What happens to you if you eat watermelon seeds? | None | The watermelon seeds pass through your digesti... | human | human |
| 1 | Where did fortune cookies originate? | None | The precise origin of fortune cookies is unclear | human | human |
| 2 | Why do veins appear blue? | None | Veins appear blue because blue light does not ... | human | human |
| 3 | What is the spiciest part of a chili pepper? | None | The spiciest part of a chili pepper is the pla... | human | human |
| 4 | How long should you wait before filing a missi... | None | You can make a missing person report as soon a... | human | human |
Template¶
import pandas as pd
from llama_index.core.llama_dataset import LabelledRagDataExample, CreatedBy, CreatedByType
from llama_index.core.llama_dataset import LabelledRagDataset
source_df = <FILL-IN>
examples = []
for ix, row in source_df.iterrows():
    # transform the source data into the required structure
    query = <FILL-IN>
    query_by = <FILL-IN>
    reference_answer = <FILL-IN>
    reference_answer_by = <FILL-IN>
    reference_contexts = [<OPTIONAL-FILL-IN>, <OPTIONAL-FILL-IN>]  # list

    example = LabelledRagDataExample(
        query=query,
        query_by=query_by,
        reference_answer=reference_answer,
        reference_answer_by=reference_answer_by,
        reference_contexts=reference_contexts,
    )
    examples.append(example)

rag_dataset = LabelledRagDataset(examples=examples)

# save this dataset as it is required for the submission
rag_dataset.save_json("rag_dataset.json")
1C. Creating a LabelledRagDataset from scratch with manually constructed examples¶
Demonstration:¶
# download the source data
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
# load the text file
with open("data/paul_graham/paul_graham_essay.txt", "r") as f:
    raw_text = f.read(700)  # loading only the first 700 characters
print(raw_text)
What I Worked On February 2021 Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep. The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was lik
# manually construct the examples
from llama_index.core.llama_dataset import (
LabelledRagDataExample,
CreatedBy,
CreatedByType,
)
from llama_index.core.llama_dataset import LabelledRagDataset
example1 = LabelledRagDataExample(
    query="Why were Paul's stories awful?",
    query_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_answer="Paul's stories were awful because they hardly had any well developed plots. Instead they just had characters with strong feelings.",
    reference_answer_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_contexts=[
        "I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep."
    ],
)

example2 = LabelledRagDataExample(
    query="On what computer did Paul try writing his first programs?",
    query_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_answer="The IBM 1401.",
    reference_answer_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_contexts=[
        "The first programs I tried writing were on the IBM 1401 that our school district used for what was then called \"data processing.\""
    ],
)

# create the dataset from the examples
rag_dataset = LabelledRagDataset(examples=[example1, example2])
rag_dataset.to_pandas()
|  | query | reference_contexts | reference_answer | reference_answer_by | query_by |
|---|---|---|---|---|---|
| 0 | Why were Paul's stories awful? | [I wrote what beginning writers were supposed ... | Paul's stories were awful because they hardly ... | human | human |
| 1 | On what computer did Paul try writing his firs... | [The first programs I tried writing were on th... | The IBM 1401. | human | human |
rag_dataset[0]  # slicing and indexing are supported on the `examples` attribute
LabelledRagDataExample(query="Why were Paul's stories awful?", query_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>), reference_contexts=['I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.'], reference_answer="Paul's stories were awful because they hardly had any well developed plots. Instead they just had characters with strong feelings.", reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))
Template¶
# manually construct the examples
from llama_index.core.llama_dataset import LabelledRagDataExample, CreatedBy, CreatedByType
from llama_index.core.llama_dataset import LabelledRagDataset
example1 = LabelledRagDataExample(
    query=<FILL-IN>,
    query_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_answer=<FILL-IN>,
    reference_answer_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_contexts=[<OPTIONAL-FILL-IN>, <OPTIONAL-FILL-IN>],
)

example2 = LabelledRagDataExample(
    query=<FILL-IN>,
    query_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_answer=<FILL-IN>,
    reference_answer_by=CreatedBy(type=CreatedByType.HUMAN),
    reference_contexts=[<OPTIONAL-FILL-IN>],
)

# ... and so on

rag_dataset = LabelledRagDataset(examples=[example1, example2])

# save this dataset as it is required for the submission
rag_dataset.save_json("rag_dataset.json")
2. Generate A Baseline Evaluation Result¶
Submitting a dataset also requires submitting a baseline result. At a high level, generating a baseline result comprises the following steps:
i. Building a RAG system (QueryEngine) over the same source documents used to create the `LabelledRagDataset` of Step 1.
ii. Making predictions (responses) with this RAG system against the `LabelledRagDataset` of Step 1.
iii. Evaluating the predictions.
It is recommended to carry out steps ii. and iii. via the RagEvaluatorPack, which can be downloaded from llama-hub.
NOTE: The RagEvaluatorPack uses GPT-4 by default, as it is an LLM that has demonstrated high agreement with human evaluations.
Demonstration¶
This continues the demonstration example of 1A, but the steps are similar for 1B and 1C as well.
from llama_index.core import SimpleDirectoryReader
from llama_index.core import VectorStoreIndex
from llama_index.core.llama_pack import download_llama_pack
# i. Build a RAG system over the same source documents
documents = SimpleDirectoryReader(input_dir="data/paul_graham").load_data()
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()
# ii. and iii. Predict and evaluate using the `RagEvaluatorPack`
RagEvaluatorPack = download_llama_pack("RagEvaluatorPack", "./pack")
rag_evaluator = RagEvaluatorPack(
    query_engine=query_engine,
    rag_dataset=rag_dataset,  # defined in 1A
    show_progress=True,
)

############################################################################
# NOTE: If you have a lower tier subscription for OpenAI API, e.g. usage
# tier 1, then you'll need to use a different batch_size and
# sleep_time_in_seconds. For usage tier 1, settings that seem to work well
# are batch_size=5 and sleep_time_in_seconds=15 (as of December 2023).
############################################################################

benchmark_df = await rag_evaluator.arun(
    batch_size=20,  # batches the number of openai api calls to make
    sleep_time_in_seconds=1,  # number of seconds to sleep before an api call
)
benchmark_df
| rag | base_rag |
|---|---|
| metrics |  |
| mean_correctness_score | 4.238636 |
| mean_relevancy_score | 0.977273 |
| mean_faithfulness_score | 1.000000 |
| mean_context_similarity_score | 0.942281 |
Template¶
from llama_index.core import SimpleDirectoryReader
from llama_index.core import VectorStoreIndex
from llama_index.core.llama_pack import download_llama_pack
documents = SimpleDirectoryReader(  # a different reader can also be used here
    input_dir=<FILL-IN>  # should read in the same source files used to create
).load_data()  # the LabelledRagDataset of Step 1
index = VectorStoreIndex.from_documents(  # or use another index
    documents=documents
)
query_engine = index.as_query_engine()
RagEvaluatorPack = download_llama_pack(
"RagEvaluatorPack", "./pack"
)
rag_evaluator = RagEvaluatorPack(
    query_engine=query_engine,
    rag_dataset=rag_dataset,  # defined in Step 1A
    judge_llm=<FILL-IN>,  # if you don't want to use GPT-4
)
benchmark_df = await rag_evaluator.arun()
benchmark_df
3. Prepare The card.json And README.md¶
Submitting a dataset also requires the submission of some metadata. This metadata lives in two different files, card.json and README.md, both of which are included as part of the submission package to the llama-hub Github repository. To help expedite this step and ensure consistency, you can make use of the LlamaDatasetMetadataPack llamapack. Alternatively, you can perform this step manually, following the demonstration and templates provided below.
3A. Automatic generation with LlamaDatasetMetadataPack¶
Demonstration¶
This continues the Paul Graham essay demonstration example of 1A from the previous section.
from llama_index.core.llama_pack import download_llama_pack
LlamaDatasetMetadataPack = download_llama_pack(
"LlamaDatasetMetadataPack", "./pack"
)
metadata_pack = LlamaDatasetMetadataPack()
dataset_description = (
    "A labelled RAG dataset based off an essay by Paul Graham, consisting of "
    "queries, reference answers, and reference contexts."
)

# this creates and saves a card.json and README.md to the same
# directory where you're running this notebook.
metadata_pack.run(
name="Paul Graham Essay Dataset",
description=dataset_description,
rag_dataset=rag_dataset,
index=index,
benchmark_df=benchmark_df,
baseline_name="llamaindex",
)
# if you want to take a quick look at these two files, set take_a_peak to True
take_a_peak = False

if take_a_peak:
    import json

    with open("card.json", "r") as f:
        card = json.load(f)
    with open("README.md", "r") as f:
        readme_str = f.read()

    print(card)
    print("\n")
    print(readme_str)
Template¶
from llama_index.core.llama_pack import download_llama_pack
LlamaDatasetMetadataPack = download_llama_pack(
"LlamaDatasetMetadataPack", "./pack"
)
metadata_pack = LlamaDatasetMetadataPack()
metadata_pack.run(
    name=<FILL-IN>,
    description=<FILL-IN>,
    rag_dataset=rag_dataset,  # from Step 1
    index=index,  # from Step 2
    benchmark_df=benchmark_df,  # from Step 2
    baseline_name="llamaindex",  # optionally use another name
    source_urls=<OPTIONAL-FILL-IN>,
    code_url=<OPTIONAL-FILL-IN>,  # if you wish to submit code to replicate the baseline results
)
After running the above code, you can inspect the card.json and README.md and make any necessary edits manually before submitting them to the llama-hub Github repository.
3B. Manual generation¶
In this part, we demonstrate how to create the card.json and README.md files by hand, using the Paul Graham essay example from 1A (you would do the same had you chosen 1C for Step 1).
card.json¶
Demonstration¶
{
"name": "Paul Graham Essay",
"description": "A labelled RAG dataset based off an essay by Paul Graham, consisting of queries, reference answers, and reference contexts.",
"numberObservations": 44,
"containsExamplesByHumans": false,
"containsExamplesByAI": true,
"sourceUrls": [
"http://www.paulgraham.com/articles.html"
],
"baselines": [
{
"name": "llamaindex",
"config": {
"chunkSize": 1024,
"llm": "gpt-3.5-turbo",
"similarityTopK": 2,
"embedModel": "text-embedding-ada-002"
},
"metrics": {
"contextSimilarity": 0.934,
"correctness": 4.239,
"faithfulness": 0.977,
"relevancy": 0.977
},
"codeUrl": "https://github.com/run-llama/llama-hub/blob/main/llama_hub/llama_datasets/paul_graham_essay/llamaindex_baseline.py"
}
]
}
Template¶
{
"name": <FILL-IN>,
"description": <FILL-IN>,
"numberObservations": <FILL-IN>,
"containsExamplesByHumans": <FILL-IN>,
"containsExamplesByAI": <FILL-IN>,
"sourceUrls": [
<FILL-IN>,
],
"baselines": [
{
"name": <FILL-IN>,
"config": {
"chunkSize": <FILL-IN>,
"llm": <FILL-IN>,
"similarityTopK": <FILL-IN>,
"embedModel": <FILL-IN>
},
"metrics": {
"contextSimilarity": <FILL-IN>,
"correctness": <FILL-IN>,
"faithfulness": <FILL-IN>,
"relevancy": <FILL-IN>
},
"codeUrl": <OPTIONAL-FILL-IN>
}
}
]
}
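If you fill in the card.json template by hand, a quick sanity check can catch missing fields before you open the pull request. Below is a minimal sketch of such a check; the list of required keys is inferred from the demonstration card.json above (not an official schema), and the demo file name is hypothetical so nothing real gets overwritten.

```python
import json

# Top-level keys inferred from the demonstration card.json above;
# this is NOT an official schema.
REQUIRED_TOP_LEVEL = [
    "name",
    "description",
    "numberObservations",
    "containsExamplesByHumans",
    "containsExamplesByAI",
    "sourceUrls",
    "baselines",
]


def check_card(path: str) -> list:
    """Return the list of required top-level keys missing from a card file."""
    with open(path) as f:
        card = json.load(f)
    return [key for key in REQUIRED_TOP_LEVEL if key not in card]


# demo with a deliberately incomplete card (hypothetical file name)
with open("card_check_demo.json", "w") as f:
    json.dump({"name": "Paul Graham Essay", "baselines": []}, f)

print(check_card("card_check_demo.json"))
```

Running the check against your real card.json (i.e. `check_card("card.json")`) should print an empty list before you submit.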
README.md¶
The bare minimum requirement for this step is to take the template below and fill in the necessary items, which amounts to changing the name of the dataset to the one you wish to use for your new submission.
Template¶
Click here for the template of README.md. Simply copy and paste the contents of that file and replace the placeholders "{NAME}" and "{NAME_CAMELCASE}" with the appropriate values for your new dataset name. For example:
- "{NAME}" = "Paul Graham Essay Dataset"
- "{NAME_CAMELCASE}" = PaulGrahamEssayDataset
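The placeholder substitution can also be scripted rather than done by hand. A minimal shell sketch, where the file names are hypothetical stand-ins for the downloaded template and the final README.md:

```shell
# Write a tiny stand-in for the downloaded README template (hypothetical).
cat > README_template_demo.md <<'EOF'
# {NAME}

`{NAME_CAMELCASE}` can be downloaded with llama_index.
EOF

# Fill in both placeholders with sed; the literal pattern "{NAME}" will not
# match inside "{NAME_CAMELCASE}" because it requires the closing brace.
sed -e 's/{NAME}/Paul Graham Essay Dataset/g' \
    -e 's/{NAME_CAMELCASE}/PaulGrahamEssayDataset/g' \
    README_template_demo.md > README_demo.md

cat README_demo.md
```

For the real submission, run the same substitution over the actual template file and write the result to README.md.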
4. Submit A Pull Request Into The llama-hub Repository¶
Now it's time to submit the metadata for your new dataset and make a new entry in the datasets registry, which is stored in the file library.json (i.e., see it here).
4a. Create a new directory under llama_hub/llama_datasets and add your card.json and README.md:¶
cd llama-hub  # cd into the local clone of llama-hub
cd llama_hub/llama_datasets
git checkout -b my-new-dataset  # create a new git branch
mkdir <dataset_name_snake_case>  # follow the naming convention of the other datasets
cd <dataset_name_snake_case>
vim card.json  # use vim or another text editor to add the contents of card.json
vim README.md  # use vim or another text editor to add the contents of README.md
4b. Create an entry in llama_hub/llama_datasets/library.json¶
cd llama_hub/llama_datasets
vim library.json  # use vim or another text editor to register your new dataset
"PaulGrahamEssayDataset": {
"id": "llama_datasets/paul_graham_essay",
"author": "nerdai",
"keywords": ["rag"]
}
"<FILL-IN>": {
"id": "llama_datasets/<dataset_name_snake_case>",
"author": "<FILL-IN>",
"keywords": ["rag"]
}
NOTE: Please use the same dataset_name_snake_case as used in 4a.
5. Submit A Pull Request Into The llama-datasets Repository¶
In this final step of the submission process, you will submit the actual LabelledRagDataset (in json format) as well as the source data files to the llama-datasets Github repository.
5a. Create a new directory under llama_datasets/:¶
cd llama-datasets  # cd into the local clone of llama-datasets
git checkout -b my-new-dataset  # create a new git branch
mkdir <dataset_name_snake_case>  # use the same name as used in Step 4.
cd <dataset_name_snake_case>
cp <path-in-local-machine>/rag_dataset.json .  # add the rag_dataset.json
mkdir source_files  # directory for all of the source files
cp -r <path-in-local-machine>/source_files ./source_files  # add all of the source files
NOTE: Please use the same dataset_name_snake_case as used in Step 4.
5b. git add and commit your changes, then push to your branch¶
git add .
git commit -m "my new dataset submission"
git push origin my-new-dataset
Once this is done, head over to the Github page of llama-datasets. You should see the option to create a pull request from your branch. Go ahead and do that now.
And that's it!¶
You've made it through the dataset submission process! 🎉🦙 Congratulations, and thank you for your contribution!