Benchmarking RAG Pipelines with a LabelledRagDataset
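The LabelledRagDataset is meant for evaluating any given RAG pipeline, for which there may be several configurations (for example the choice of LLM, the value of similarity_top_k, the chunk_size, and so on). This abstraction is analogous to a traditional machine learning dataset, where X features are used to predict a ground-truth label y. Here, the query together with the retrieved contexts serve as the "features", and the answer to the query, called the reference_answer, serves as the ground-truth label.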
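The analogy above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: ToyExample, to_features_and_label, exact_match_rate and the always_yes "pipeline" are hypothetical names, and exact-match scoring is an assumption chosen only because it is simple.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyExample:
    """Hypothetical stand-in for one labelled example (not the library class)."""

    query: str
    reference_contexts: List[str]
    reference_answer: str


def to_features_and_label(ex: ToyExample) -> Tuple[Tuple[str, List[str]], str]:
    """Split an example into (X, y): X = (query, contexts), y = reference answer."""
    return (ex.query, ex.reference_contexts), ex.reference_answer


def exact_match_rate(
    examples: List[ToyExample],
    pipeline: Callable[[str, List[str]], str],
) -> float:
    """Fraction of examples where the pipeline's answer equals the label."""
    hits = 0
    for ex in examples:
        (query, contexts), label = to_features_and_label(ex)
        # Assumed metric: case-insensitive exact match against the label.
        if pipeline(query, contexts).strip().lower() == label.strip().lower():
            hits += 1
    return hits / len(examples) if examples else 0.0


examples = [
    ToyExample("Is this a test query?", ["This is a sample context"], "Yes"),
    ToyExample("Is this a second query?", ["Another sample context"], "No"),
]


def always_yes(query: str, contexts: List[str]) -> str:
    """A deliberately naive stand-in for a RAG pipeline."""
    return "yes"


score = exact_match_rate(examples, always_yes)
print(score)
```

Swapping always_yes for pipelines built with different settings would give one score per configuration, which is the kind of comparison a labelled dataset is meant to support.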
Such a dataset is, of course, made up of observations, or examples. For a LabelledRagDataset, these are a set of LabelledRagDataExample objects.
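In this notebook, we show how to construct a LabelledRagDataset from scratch. Note that the alternative is to simply download a community-supplied LabelledRagDataset from llama-hub and evaluate or benchmark your own RAG pipeline on it.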
The LabelledRagDataExample Class
%pip install llama-index-llms-openai
%pip install llama-index-readers-wikipedia
from llama_index.core.llama_dataset import (
    LabelledRagDataExample,
    CreatedByType,
    CreatedBy,
)

# construct a LabelledRagDataExample
query = "This is a test query, is it not?"
query_by = CreatedBy(type=CreatedByType.AI, model_name="gpt-4")
reference_answer = "Yes it is."
reference_answer_by = CreatedBy(type=CreatedByType.HUMAN)
reference_contexts = ["This is a sample context"]

rag_example = LabelledRagDataExample(
    query=query,
    query_by=query_by,
    reference_contexts=reference_contexts,
    reference_answer=reference_answer,
    reference_answer_by=reference_answer_by,
)
The LabelledRagDataExample is a Pydantic model, so it can be converted to and from json or a dict.
print(rag_example.json())
{"query": "This is a test query, is it not?", "query_by": {"model_name": "gpt-4", "type": "ai"}, "reference_contexts": ["This is a sample context"], "reference_answer": "Yes it is.", "reference_answer_by": {"model_name": "", "type": "human"}}
LabelledRagDataExample.parse_raw(rag_example.json())
LabelledRagDataExample(query='This is a test query, is it not?', query_by=CreatedBy(model_name='gpt-4', type=<CreatedByType.AI: 'ai'>), reference_contexts=['This is a sample context'], reference_answer='Yes it is.', reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))
rag_example.dict()
{'query': 'This is a test query, is it not?', 'query_by': {'model_name': 'gpt-4', 'type': <CreatedByType.AI: 'ai'>}, 'reference_contexts': ['This is a sample context'], 'reference_answer': 'Yes it is.', 'reference_answer_by': {'model_name': '', 'type': <CreatedByType.HUMAN: 'human'>}}
LabelledRagDataExample.parse_obj(rag_example.dict())
LabelledRagDataExample(query='This is a test query, is it not?', query_by=CreatedBy(model_name='gpt-4', type=<CreatedByType.AI: 'ai'>), reference_contexts=['This is a sample context'], reference_answer='Yes it is.', reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))
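One detail in the outputs above is that the dict form keeps the CreatedByType enum member, while the JSON form carries the plain string "ai" or "human". The sketch below reproduces that behaviour with the standard library only. ToyCreatedByType and ToyCreatedBy are hypothetical stand-ins written as dataclasses, not the Pydantic classes from llama_index, so this illustrates the idea and makes no claim about the library's internals.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class ToyCreatedByType(str, Enum):
    """Hypothetical stand-in for CreatedByType (a string-valued enum)."""

    HUMAN = "human"
    AI = "ai"


@dataclass
class ToyCreatedBy:
    """Hypothetical stand-in for CreatedBy."""

    type: ToyCreatedByType
    model_name: str = ""


created_by = ToyCreatedBy(type=ToyCreatedByType.AI, model_name="gpt-4")

# dict form: the enum member is preserved as-is.
as_dict = asdict(created_by)

# JSON form: the enum is written out as its string value.
as_json = json.dumps(as_dict)

# Restoring from JSON requires coercing the string back into the enum.
restored = ToyCreatedBy(**json.loads(as_json))
restored.type = ToyCreatedByType(restored.type)

print(as_dict["type"] is ToyCreatedByType.AI)
print(as_json)
```

Pydantic performs the string-to-enum coercion automatically when parsing, which is why parse_raw and parse_obj both return the same object in the cells above.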
Let's create a second example, so that we can have a (slightly) more interesting LabelledRagDataset.
query = "This is a test query, is it so?"
reference_answer = "I think yes, it is."
reference_contexts = ["This is a second sample context"]
rag_example_2 = LabelledRagDataExample(
    query=query,
    query_by=query_by,
    reference_contexts=reference_contexts,
    reference_answer=reference_answer,
    reference_answer_by=reference_answer_by,
)
The LabelledRagDataset Class
from llama_index.core.llama_dataset import LabelledRagDataset
rag_dataset = LabelledRagDataset(examples=[rag_example, rag_example_2])
There is a convenience method for viewing the dataset as a pandas.DataFrame.
rag_dataset.to_pandas()
| | query | reference_contexts | reference_answer | reference_answer_by | query_by |
---|---|---|---|---|---|
0 | This is a test query, is it not? | [This is a sample context] | Yes it is. | human | ai (gpt-4) |
1 | This is a test query, is it so? | [This is a second sample context] | I think yes, it is. | human | ai (gpt-4) |
Serialization
To persist the dataset to disk and load it back, use the save_json and from_json methods.
rag_dataset.save_json("rag_dataset.json")
reload_rag_dataset = LabelledRagDataset.from_json("rag_dataset.json")
reload_rag_dataset.to_pandas()
| | query | reference_contexts | reference_answer | reference_answer_by | query_by |
---|---|---|---|---|---|
0 | This is a test query, is it not? | [This is a sample context] | Yes it is. | human | ai (gpt-4) |
1 | This is a test query, is it so? | [This is a second sample context] | I think yes, it is. | human | ai (gpt-4) |
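Because the saved file is ordinary JSON, it can also be inspected without the library. The sketch below assumes that the file stores its examples as a list under a top-level "examples" key. That layout is an assumption to check against your installed version, and the toy file here is written by hand, not by save_json.

```python
import json
import tempfile
from pathlib import Path

# Toy payload in the ASSUMED layout: a top-level "examples" list of dicts.
payload = {
    "examples": [
        {
            "query": "This is a test query, is it not?",
            "query_by": {"model_name": "gpt-4", "type": "ai"},
            "reference_contexts": ["This is a sample context"],
            "reference_answer": "Yes it is.",
            "reference_answer_by": {"model_name": "", "type": "human"},
        }
    ]
}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, written to a throwaway directory.
    path = Path(tmp_dir) / "toy_rag_dataset.json"
    path.write_text(json.dumps(payload, indent=4), encoding="utf-8")

    # Read the file back with the standard json module only.
    loaded = json.loads(path.read_text(encoding="utf-8"))

queries = [example["query"] for example in loaded["examples"]]
print(len(loaded["examples"]), queries)
```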
Building a synthetic LabelledRagDataset over Wikipedia
In this section, we create a LabelledRagDataset with a synthetic generator. An LLM (gpt-3.5-turbo in this notebook) produces both the query and the reference_answer for each synthetic LabelledRagDataExample.
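NOTE: If you already have queries, reference answers, and contexts over a text corpus, you do not need data synthesis in order to make predictions and then evaluate them.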
import nest_asyncio
nest_asyncio.apply()
!pip install wikipedia -q
# wikipedia pages
from llama_index.readers.wikipedia import WikipediaReader
from llama_index.core import VectorStoreIndex

cities = [
    "San Francisco",
]

documents = WikipediaReader().load_data(
    pages=[f"History of {x}" for x in cities]
)
index = VectorStoreIndex.from_documents(documents)
The RagDatasetGenerator can be built over a set of documents to generate LabelledRagDataExample objects.
# generate questions against chunks
from llama_index.core.llama_dataset.generator import RagDatasetGenerator
from llama_index.llms.openai import OpenAI

# set context for llm provider
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.3)

# instantiate a DatasetGenerator
dataset_generator = RagDatasetGenerator.from_documents(
    documents,
    llm=llm,
    num_questions_per_chunk=2,  # set the number of questions per node
    show_progress=True,
)
len(dataset_generator.nodes)
13
# since there are 13 nodes, there should be a total of 26 questions
rag_dataset = dataset_generator.generate_dataset_from_nodes()
rag_dataset.to_pandas()
| | query | reference_contexts | reference_answer | reference_answer_by | query_by |
---|---|---|---|---|---|
0 | How did the gold rush of 1849 impact the devel... | [The history of the city of San Francisco, Cal... | The gold rush of 1849 had a significant impact... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
1 | What were the early European settlements estab... | [The history of the city of San Francisco, Cal... | The early European settlements established in ... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
2 | How did the arrival of Europeans impact the se... | [== Arrival of Europeans and early settlement ... | The arrival of Europeans had a significant imp... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
3 | What were some of the challenges faced by the ... | [== Arrival of Europeans and early settlement ... | The early settlers of San Francisco faced seve... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
4 | How did the California gold rush impact the po... | [== 1848 gold rush ==\nThe California gold rus... | The California gold rush in the mid-19th centu... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
5 | Discuss the role of Chinese immigrants in the ... | [== 1848 gold rush ==\nThe California gold rus... | Chinese immigrants played a significant role i... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
6 | How did San Francisco transform into a major c... | [== Paris of the West ==\n\nIt was during the ... | San Francisco transformed into a major city du... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
7 | What were some significant developments and ch... | [== Paris of the West ==\n\nIt was during the ... | During the late 19th and early 20th centuries,... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
8 | How did Abe Ruef contribute to Eugene Schmitz'... | [== Corruption and graft trials ==\n\nMayor Eu... | Abe Ruef contributed $16,000 to Eugene Schmitz... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
9 | Describe the impact of the 1906 earthquake and... | [== Corruption and graft trials ==\n\nMayor Eu... | The 1906 earthquake and fire had a devastating... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
10 | How did the 1906 San Francisco earthquake impa... | [=== Reconstruction ===\nAlmost immediately af... | The 1906 San Francisco earthquake had a signif... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
11 | What major events and developments took place ... | [=== Reconstruction ===\nAlmost immediately af... | During the 1930s and World War II, several maj... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
12 | How did the post-World War II era contribute t... | [== Post-World War II ==\nAfter World War II, ... | After World War II, many American military per... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
13 | Discuss the impact of urban renewal initiative... | [== Post-World War II ==\nAfter World War II, ... | M. Justin Herman led urban renewal initiatives... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
14 | How did San Francisco become a center of count... | [== 1960 – 1970s ==\n\n\n=== "Summer of Love" ... | San Francisco became a center of countercultur... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
15 | Explain the role of San Francisco as a "Gay Me... | [== 1960 – 1970s ==\n\n\n=== "Summer of Love" ... | During the 1960s and beyond, San Francisco bec... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
16 | How did the construction of BART and Muni impa... | [=== New public infrastructure ===\nThe 1970s ... | The construction of BART and Muni in the 1970s... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
17 | What were the major challenges faced by San Fr... | [=== New public infrastructure ===\nThe 1970s ... | In the 1980s, San Francisco faced several majo... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
18 | How did the 1989 Loma Prieta earthquake impact... | [=== 1989 Loma Prieta earthquake ===\n\nOn Oct... | The 1989 Loma Prieta earthquake had significan... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
19 | Discuss the effects of the dot-com boom in the... | [=== 1989 Loma Prieta earthquake ===\n\nOn Oct... | The dot-com boom in the late 1990s had signifi... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
20 | How did the redevelopment of the Mission Bay n... | [== 2010s ==\nThe early 2000s and into the 201... | The redevelopment of the Mission Bay neighborh... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
21 | What significant events occurred in San Franci... | [== 2010s ==\nThe early 2000s and into the 201... | In 2010, the San Francisco Giants won their fi... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
22 | In the context of San Francisco's history, dis... | [=== Cultural themes ===\nBerglund, Barbara (2... | The 1906 earthquake had a significant impact o... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
23 | How did different ethnic and religious communi... | [=== Cultural themes ===\nBerglund, Barbara (2... | Two specific communities mentioned in the sour... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
24 | In the context of San Francisco's history, wha... | [=== Gold rush & early days ===\nHittell, John... | Some significant events and developments durin... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
25 | How did politics shape the growth and transfor... | [=== Gold rush & early days ===\nHittell, John... | The provided sources offer a comprehensive und... | ai (gpt-3.5-turbo) | ai (gpt-3.5-turbo) |
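The table has 26 rows (indices 0 to 25), which matches 13 nodes at 2 questions per node. A small sanity check along these lines can be sketched as follows. expected_question_count is a hypothetical helper, and the observed count is copied from the table above instead of being read from the dataset object.

```python
def expected_question_count(num_nodes: int, num_questions_per_chunk: int) -> int:
    """Questions expected if every node yields exactly the requested number."""
    return num_nodes * num_questions_per_chunk


# Values taken from this notebook: len(dataset_generator.nodes) == 13 and
# num_questions_per_chunk=2.
expected = expected_question_count(num_nodes=13, num_questions_per_chunk=2)

# Row count read off the table above (indices 0 through 25).
observed = 26

print(expected, observed, expected == observed)
```

In a live session, the observed count could instead be taken from the number of examples in the generated dataset. A mismatch would suggest that some node produced more or fewer questions than requested.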
rag_dataset.save_json("rag_dataset.json")