Install Dependencies¶
To run this notebook, you will need to install the following packages:
- llama-index-embeddings-huggingface
- llama-index-llms-openai
- llama-index-agents-openai
You can install them with the %pip commands in the cell below.
In [ ]:
%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-openai
%pip install llama-index-agents-openai
In [ ]:
import os

os.environ["OPENAI_API_KEY"] = "sk-..."
Initialize and Set Up the LLM and Local Embedding Model¶
In [ ]:
from llama_index.core.settings import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI

Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)
Settings.llm = OpenAI()
Download and Index the Data¶
We do this here for demonstration purposes. In a production environment, the data store and index would already exist rather than being created on the fly.
Create the Storage Context¶
In [ ]:
from llama_index.core import (
    StorageContext,
    load_index_from_storage,
)

try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft",
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    index_loaded = False
Download the Data
In [ ]:
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'
Load the Data
In [ ]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

if not index_loaded:
    # load the documents
    lyft_docs = SimpleDirectoryReader(
        input_files=["./data/10k/lyft_2021.pdf"]
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["./data/10k/uber_2021.pdf"]
    ).load_data()

    # build the indices
    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
    uber_index = VectorStoreIndex.from_documents(uber_docs)

    # persist the indices
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")
Create the Query Engines¶
In [ ]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=5)
uber_engine = uber_index.as_query_engine(similarity_top_k=5)
Create the Evaluator¶
In [ ]:
from llama_index.core.evaluation import RelevancyEvaluator

evaluator = RelevancyEvaluator()
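A relevancy evaluator judges whether a response (plus its retrieved context) actually answers the query, and exposes a pass/fail flag derived from a YES/NO judgment, as seen in the "Reason: NO" traces later in this notebook. The mapping can be sketched in plain Python; this is an illustrative mock with hypothetical names, not llama-index's actual implementation:

```python
# Mock of the YES/NO gating used by a relevancy-style evaluator.
# MockEvaluationResult and mock_evaluate are hypothetical names for
# illustration only; they are NOT part of llama-index.

from dataclasses import dataclass


@dataclass
class MockEvaluationResult:
    query: str
    response: str
    feedback: str  # raw judgment from the judge LLM, e.g. "YES" or "NO"
    passing: bool  # derived from the feedback


def mock_evaluate(
    query: str, response: str, judge_feedback: str
) -> MockEvaluationResult:
    """Map the judge LLM's raw YES/NO feedback to a pass/fail flag."""
    passing = "yes" in judge_feedback.lower()
    return MockEvaluationResult(query, response, judge_feedback, passing)


result = mock_evaluate(
    "What was Uber's revenue growth in 2021?",
    "Lyft's revenue grew in 2021.",
    judge_feedback="NO",
)
print(result.passing)  # → False
```

In the real evaluator the judgment comes from an LLM call rather than a passed-in string, but the downstream use is the same: only `passing` determines whether a tool's output is accepted.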
Create the Query Engine Tools¶
In [ ]:
from llama_index.core.tools import ToolMetadata
from llama_index.core.tools.eval_query_engine import EvalQueryEngineTool

query_engine_tools = [
    EvalQueryEngineTool(
        evaluator=evaluator,
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft",
            description=(
                "Provides information about Lyft's financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    EvalQueryEngineTool(
        evaluator=evaluator,
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber",
            description=(
                "Provides information about Uber's financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]
In [ ]:
from llama_index.agent.openai import OpenAIAgent

agent = OpenAIAgent.from_tools(query_engine_tools, verbose=True)
Query Engine Fails Evaluation¶
For demonstration purposes, we will tell the agent to choose the wrong tool first, so that we can observe the effect of the EvalQueryEngineTool when an evaluation fails. To achieve this, we set tool_choice to lyft when calling the agent.
Here is what we expect to happen:
- The agent will first use the lyft tool, which contains the wrong financial data for this question, because we have instructed it to do so
- The EvalQueryEngineTool will evaluate the query engine's response with its evaluator
- The query engine's output will fail the evaluation because it contains Lyft's financials rather than Uber's
- The tool will form a response informing the agent that the tool could not be used, giving the reason
- The agent will fall back to the second tool, uber
- The second tool's query engine output will pass the evaluation because it contains Uber's financials
- The agent will respond with an answer
In [ ]:
response = await agent.achat(
    "What was Uber's revenue growth in 2021?", tool_choice="lyft"
)
print(str(response))
Added user message to memory: What was Uber's revenue growth in 2021?
=== Calling Function ===
Calling function: lyft with args: {"input":"What was Uber's revenue growth in 2021?"}
Got output: Could not use tool lyft because it failed evaluation.
Reason: NO
========================
=== Calling Function ===
Calling function: uber with args: {"input":"What was Uber's revenue growth in 2021?"}
Got output: Uber's revenue grew by 57% in 2021.
========================
Uber's revenue grew by 57% in 2021.
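The gating behavior visible in this trace can be sketched in plain Python: a wrapper calls the underlying engine, runs the evaluator on the result, and, if the evaluation fails, returns an error message instead of the answer, which the agent reads and uses to fall back to another tool. This is an illustrative mock under assumed names, not the library's code:

```python
# Illustrative sketch of an evaluation-gated tool. make_gated_tool and the
# toy engines below are hypothetical; the real EvalQueryEngineTool wraps a
# llama-index query engine and evaluator instead of plain callables.

from typing import Callable


def make_gated_tool(
    name: str,
    query_engine: Callable[[str], str],
    evaluator: Callable[[str, str], bool],
) -> Callable[[str], str]:
    def tool(query: str) -> str:
        answer = query_engine(query)
        if evaluator(query, answer):
            return answer
        # A failing evaluation surfaces as an error string; the agent
        # sees it as the tool output and can try a different tool.
        return f"Could not use tool {name} because it failed evaluation."

    return tool


# Toy engine and evaluator: the answer is "relevant" only if the
# company name appears in the question.
lyft_tool = make_gated_tool(
    "lyft",
    lambda q: "Lyft's revenue grew in 2021.",
    lambda q, a: "lyft" in q.lower(),
)

print(lyft_tool("What was Uber's revenue growth in 2021?"))
# → Could not use tool lyft because it failed evaluation.
```

The key design point, mirrored in the trace above, is that a failed evaluation does not raise an exception: it becomes a normal tool output, so the agent's usual reasoning loop handles the fallback.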
Query Engine Passes Evaluation¶
Here we ask a question about Lyft's financials. This is what we expect to happen:
- The agent will first use the lyft tool, based solely on its description, because we do not set tool_choice here
- The EvalQueryEngineTool will evaluate the query engine's response with its evaluator
- The query engine's output will pass the evaluation because it contains Lyft's financials
In [ ]:
response = await agent.achat("What was Lyft's revenue growth in 2021?")
print(str(response))
Added user message to memory: What was Lyft's revenue growth in 2021?
=== Calling Function ===
Calling function: lyft with args: {"input": "What was Lyft's revenue growth in 2021?"}
Got output: Lyft's revenue growth in 2021 was $3,208,323, which increased compared to the revenue in 2020 and 2019.
========================
=== Calling Function ===
Calling function: uber with args: {"input": "What was Lyft's revenue growth in 2021?"}
Got output: Could not use tool uber because it failed evaluation.
Reason: NO
========================
Lyft's revenue grew by $3,208,323 in 2021, which increased compared to the revenue in 2020 and 2019.