How to build a tool-using agent with LangChain
This notebook takes you through how to use LangChain to augment an OpenAI model with access to external tools. In particular, you'll be able to create LLM agents that use custom tools to answer user queries.
What is LangChain?
LangChain is a framework for developing applications powered by language models. Their framework enables you to build layered LLM-powered applications that are context-aware and able to interact dynamically with their environment as agents, leading to simplified code for you and a more dynamic user experience for your customers.
Why do LLMs need to use tools?
One of the most common challenges with LLMs is overcoming the lack of recency and specificity in their training data: answers can be out of date, and they are prone to hallucinations given the huge variety in their knowledge base. Tools are a great method of allowing an LLM to answer within a controlled context that draws on your existing knowledge bases and internal APIs. Instead of trying to prompt engineer the LLM all the way to your intended answer, you allow it access to tools that it calls on dynamically for info, parses, and serves to the customer.
Providing LLMs access to tools can enable them to answer questions with context directly from search engines, APIs, or your own databases. Instead of answering directly, an LLM with access to tools can perform intermediate steps to gather relevant information. Tools can also be used in combination. For example, a language model can be made to use a search tool to look up quantitative information and a calculator to execute calculations.
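As a quick illustration of combining tools, LangChain ships off-the-shelf helpers that wire a search tool and a calculator together. The cell below is a minimal sketch, not the custom approach built in the rest of this notebook; it assumes OPENAI_API_KEY and SERPAPI_API_KEY are set in your environment.
# Hedged sketch: an off-the-shelf agent that combines a search tool with a calculator
from langchain import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools

sketch_llm = OpenAI(temperature=0)
# "serpapi" wraps a web search engine; "llm-math" wraps an LLM-driven calculator
sketch_tools = load_tools(["serpapi", "llm-math"], llm=sketch_llm)
sketch_agent = initialize_agent(sketch_tools, sketch_llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
# The agent can search for a quantity, then hand it to the calculator
sketch_agent.run("What is the population of Canada divided by the population of Australia?")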
Notebook Sections
- Setup: Import packages and connect to a Pinecone vector database.
- LLM Agent: Build an agent that leverages a modified version of the ReAct framework to do chain-of-thought reasoning.
- LLM Agent with History: Provide the LLM with access to previous steps in the conversation.
- Knowledge Base: Create a knowledge base of Stuff You Should Know podcast episodes, to be accessed through a tool.
- LLM Agent with Tools: Extend the agent with access to multiple tools and test that it uses them to answer questions.
%load_ext autoreload
%autoreload 2
The autoreload extension is already loaded. To reload it, use:
%reload_ext autoreload
Setup
Import libraries and set up a connection to a Pinecone vector database.
You can substitute Pinecone for any other vectorstore or database: a selection are supported by LangChain natively, while other connectors will need to be developed yourself.
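If you want to see what a substitution looks like, the cell below is a minimal sketch (not used in the rest of this notebook) of the same pattern against Chroma, a locally hosted vectorstore that LangChain supports natively. It assumes the chromadb package is installed and OPENAI_API_KEY is set; the example text is hypothetical.
# Hedged sketch: substituting Pinecone with a local Chroma vectorstore (requires `chromadb`)
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

example_texts = ["LangChain supports a selection of vectorstores natively."]  # hypothetical documents
chroma_store = Chroma.from_texts(example_texts, OpenAIEmbeddings())
docs = chroma_store.similarity_search("Which vectorstores does LangChain support?", k=1)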
!pip install openai
!pip install pinecone-client
!pip install pandas
!pip install typing
!pip install tqdm
!pip install langchain
!pip install wget
import datetime
import json
import openai
import os
import pandas as pd
import pinecone
import re
from tqdm.auto import tqdm
from typing import List, Union
import zipfile
# LangChain imports
from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent, AgentOutputParser
from langchain.prompts import BaseChatPromptTemplate, ChatPromptTemplate
from langchain import SerpAPIWrapper, LLMChain
from langchain.schema import AgentAction, AgentFinish, HumanMessage, SystemMessage
# LLM wrappers
from langchain.chat_models import ChatOpenAI
from langchain import OpenAI
# Conversational memory
from langchain.memory import ConversationBufferWindowMemory
# Embeddings and vectorstore
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
# Vectorstore index
index_name = 'podcasts'
To get an API key to connect to Pinecone, you can set up a free account. Store it in the api_key variable below, or in an environment variable named PINECONE_API_KEY.
api_key = os.getenv("PINECONE_API_KEY") or "PINECONE_API_KEY"
# Find the environment next to your API key in the Pinecone console
env = os.getenv("PINECONE_ENVIRONMENT") or "PINECONE_ENVIRONMENT"
pinecone.init(api_key=api_key, environment=env)
pinecone.whoami()
pinecone.list_indexes()
['podcasts']
Run this code block if you want to clear the index, or if the index doesn't exist yet.
# Check whether an index with the same name already exists - if so, delete it
if index_name in pinecone.list_indexes():
    pinecone.delete_index(index_name)

# Create a new index
pinecone.create_index(name=index_name, dimension=1536)
index = pinecone.Index(index_name=index_name)

# Confirm our index was created
pinecone.list_indexes()
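The dimension=1536 above matches the length of the vectors produced by OpenAI's embedding models. As a sanity check on the shapes the index expects, the cell below is a hedged sketch with hypothetical IDs and data: each upserted vector is an (id, values, metadata) tuple whose values list must have length 1536.
# Hedged sketch: the upsert/query shapes this index accepts (hypothetical data)
dummy_vector = [0.0] * 1536
index.upsert(vectors=[("example-id", dummy_vector, {"text": "example chunk"})])
result = index.query(vector=dummy_vector, top_k=1, include_metadata=True)
# Clean up the example vector so it doesn't pollute the knowledge base later
index.delete(ids=["example-id"])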
LLM Agent
An LLM agent in LangChain has many configurable components, which are detailed in the LangChain documentation.
We'll employ a few of the core concepts to make an agent that talks in the way we want, can use tools to answer questions, and uses the appropriate language model to power the conversation.
- Prompt Template: an input template to control the LLM's behaviour and how it accepts inputs and produces outputs - this is the brain that drives your application (docs).
- Output Parser: a method of parsing the output from the LLM. If the LLM produces output using certain headers, you can enable complex interactions where variables are generated by the LLM in their response and passed into the next step of the chain (docs).
- LLM Chain: a Chain brings together a prompt template with an LLM that will execute it - in this case we'll use gpt-3.5-turbo, but this framework can also be used with OpenAI completions models, or other LLMs entirely (docs).
- Tool: an external service the LLM can use to retrieve information or execute commands should the user require it (docs).
- Agent: the glue that brings all of this together, an agent can call multiple LLM Chains, each with their own tools. Agents can be extended with your own logic to allow retries, error handling and any other methods you choose to add reliability to your application (docs).
Note: Before using this notebook with the Search tool, you'll need to sign up at https://serpapi.com/ and generate an API key. Once you have it, store it in an environment variable named SERPAPI_API_KEY.
# Initiate the search tool - note you'll need to set SERPAPI_API_KEY as an environment variable as per the instructions above
search = SerpAPIWrapper()
# Define a list of tools
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events"
    )
]
# Set up the prompt with input variables for the tools, the user input, and a scratchpad for the model to record its working
template = """Answer the following questions as best you can, but speaking as a pirate might speak. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin! Remember to speak as a pirate when giving your final answer. Use lots of "Arg"s
Question: {input}
{agent_scratchpad}"""
# Set up a prompt template
class CustomPromptTemplate(BaseChatPromptTemplate):
    # The template to use
    template: str
    # The list of tools available
    tools: List[Tool]

    def format_messages(self, **kwargs) -> List[HumanMessage]:
        # Get the intermediate steps (AgentAction, Observation tuples)
        # and format them in a particular way
        intermediate_steps = kwargs.pop("intermediate_steps")
        thoughts = ""
        for action, observation in intermediate_steps:
            thoughts += action.log
            thoughts += f"\nObservation: {observation}\nThought: "
        # Set the agent_scratchpad variable to that value
        kwargs["agent_scratchpad"] = thoughts
        # Create a tools variable from the list of tools provided
        kwargs["tools"] = "\n".join([f"{tool.name}: {tool.description}" for tool in self.tools])
        # Create a list of the tool names for the tools provided
        kwargs["tool_names"] = ", ".join([tool.name for tool in self.tools])
        formatted = self.template.format(**kwargs)
        return [HumanMessage(content=formatted)]
prompt = CustomPromptTemplate(
    template=template,
    tools=tools,
    # This omits the agent_scratchpad, tools and tool_names variables because those are generated dynamically
    # It includes the intermediate_steps variable because that is needed
    input_variables=["input", "intermediate_steps"]
)
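To inspect exactly what the model will receive, you can render the template with an empty scratchpad - a quick illustrative check rather than part of the agent flow.
# Render the prompt with no intermediate steps to see the final text sent to the model
rendered = prompt.format_messages(input="How many people live in Canada?", intermediate_steps=[])
print(rendered[0].content)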
class CustomOutputParser(AgentOutputParser):

    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:
        # Check if the agent should finish
        if "Final Answer:" in llm_output:
            return AgentFinish(
                # Return values is generally always a dictionary with a single `output` key
                # It is not recommended to try anything else at the moment :)
                return_values={"output": llm_output.split("Final Answer:")[-1].strip()},
                log=llm_output,
            )
        # Parse out the action and action input
        regex = r"Action: (.*?)[\n]*Action Input:[\s]*(.*)"
        match = re.search(regex, llm_output, re.DOTALL)

        # If it can't parse the output it raises an error
        # You can add your own logic here to handle errors differently, e.g. pass to a human or give a canned response
        if not match:
            raise ValueError(f"Could not parse LLM output: `{llm_output}`")
        action = match.group(1).strip()
        action_input = match.group(2)

        # Return the action and action input
        return AgentAction(tool=action, tool_input=action_input.strip(" ").strip('"'), log=llm_output)
output_parser = CustomOutputParser()
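As the comments in the parser note, raising on unparseable output is only one option. The cell below is a minimal sketch of a more forgiving variant - an illustration of ours, not part of the original notebook - that falls back to treating the raw text as the final answer instead of raising.
# Hypothetical variant: treat unparseable output as the final answer rather than raising
class ForgivingOutputParser(CustomOutputParser):
    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:
        try:
            return super().parse(llm_output)
        except ValueError:
            # Could instead route to a human reviewer or return a canned response here
            return AgentFinish(return_values={"output": llm_output.strip()}, log=llm_output)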
# Initiate our LLM - default is 'gpt-3.5-turbo'
llm = ChatOpenAI(temperature=0)

# LLM chain consisting of the LLM and a prompt
llm_chain = LLMChain(llm=llm, prompt=prompt)
# Using the tools, the LLM chain and the output parser to make an agent
tool_names = [tool.name for tool in tools]

agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    # We use "Observation" as our stop sequence so it will stop when it receives Tool output
    # If you change your prompt template you'll need to adjust this as well
    stop=["\nObservation:"],
    allowed_tools=tool_names
)
# Initiate the agent that will respond to our queries
# Set verbose=True to share the CoT reasoning the LLM goes through
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)
agent_executor.run("How many people live in canada as of 2023?")
> Entering new AgentExecutor chain...
Thought: Hmm, I be not sure of the answer to that one. Let me think.
Action: Search
Action Input: "Canada population 2023"
Observation:39,566,248Ahoy, that be a lot of people! But I need to make sure this be true.
Action: Search
Action Input: "Canada population 2023 official source"
Observation:The current population of Canada is 38,664,637 as of Wednesday, April 19, 2023, based on Worldometer elaboration of the latest United Nations data.Arrr, that be the official number! I be confident in me answer now.
Final Answer: The population of Canada as of 2023 is 38,664,637. Arg!
> Finished chain.
'The population of Canada as of 2023 is 38,664,637. Arg!'
agent_executor.run("How many in 2022?")
> Entering new AgentExecutor chain...
Thought: Hmm, I'm not sure what this question is asking about. I better use the search tool.
Action: Search
Action Input: "2022 events"
Observation:8. Humanitarian Crises Deepen · 7. Latin America Moves Left. · 6. Iranians Protest. · 5. COVID Eases. · 4. Inflation Returns. · 3. Climate Change ...Ahoy, it looks like this be a question about what be happenin' in 2022. Let me search again.
Action: Search
Action Input: "2022 calendar"
Observation:United States 2022 – Calendar with American holidays. Yearly calendar showing months for the year 2022. Calendars – online and print friendly – for any year ...Shiver me timbers, it looks like this be a question about the year 2022. Let me search one more time.
Action: Search
Action Input: "What be happenin' in 2022?"
Observation:8. Humanitarian Crises Deepen · 7. Latin America Moves Left. · 6. Iranians Protest. · 5. COVID Eases. · 4. Inflation Returns. · 3. Climate Change ...Avast ye, it looks like the same results be comin' up. I reckon there be no clear answer to this question.
Final Answer: Arg, I be sorry matey, but I can't give ye a clear answer to that question.
> Finished chain.
"Arg, I be sorry matey, but I can't give ye a clear answer to that question."
LLM Agent with History
Extend the LLM Agent with the ability to retain a memory and use it as context as it continues the conversation.
We use a simple ConversationBufferWindowMemory for this example, which keeps a rolling window of the last two conversation turns. LangChain has other memory options, with different tradeoffs suitable for different use cases.
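For example, a summary-based memory condenses older turns into a running summary instead of dropping them, trading extra LLM calls for longer recall. The cell below is a minimal sketch of that alternative (reusing the llm defined earlier); we stick with the buffer window for the rest of this notebook.
# Hedged sketch of an alternative memory: summarise older turns rather than truncating them
from langchain.memory import ConversationSummaryMemory

summary_memory = ConversationSummaryMemory(llm=llm)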
# Set up a prompt template which can interpolate the conversation history
template_with_history = """You are SearchGPT, a professional search engine who provides informative answers to users. Answer the following questions as best you can. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin! Remember to give detailed, informative answers
Previous conversation history:
{history}
New question: {input}
{agent_scratchpad}"""
prompt_with_history = CustomPromptTemplate(
    template=template_with_history,
    tools=tools,
    # The history template includes "history" as an input variable so we can interpolate it into the prompt
    input_variables=["input", "intermediate_steps", "history"]
)
llm_chain = LLMChain(llm=llm, prompt=prompt_with_history)
tool_names = [tool.name for tool in tools]
agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=tool_names
)
# Initiate the memory with k=2 to keep the last two turns
# Provide the memory to the agent
memory = ConversationBufferWindowMemory(k=2)
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True, memory=memory)
agent_executor.run("How many people live in canada as of 2023?")
> Entering new AgentExecutor chain...
Thought: I need to find the most recent population data for Canada.
Action: Search
Action Input: "Canada population 2023"
Observation:39,566,248This data seems reliable, but I should double-check the source.
Action: Search
Action Input: "Source of Canada population 2023"
Observation:The current population of Canada is 38,664,637 as of Wednesday, April 19, 2023, based on Worldometer elaboration of the latest United Nations data. Canada 2020 population is estimated at 37,742,154 people at mid year according to UN data. Canada population is equivalent to 0.48% of the total world population.I now know the final answer
Final Answer: As of April 19, 2023, the population of Canada is 38,664,637.
> Finished chain.
'As of April 19, 2023, the population of Canada is 38,664,637.'
agent_executor.run("how about in mexico?")
> Entering new AgentExecutor chain...
Thought: I need to search for the current population of Mexico.
Action: Search
Action Input: "current population of Mexico"
Observation:Mexico, officially the United Mexican States, is a country in the southern portion of North America. It is bordered to the north by the United States; to the south and west by the Pacific Ocean; to the southeast by Guatemala, Belize, and the Caribbean Sea; and to the east by the Gulf of Mexico.That's not the answer to the question, I need to refine my search.
Action: Search
Action Input: "population of Mexico 2023"
Observation:132,709,512I now know the final answer.
Final Answer: As of 2023, the population of Mexico is 132,709,512.
> Finished chain.
'As of 2023, the population of Mexico is 132,709,512.'
Knowledge Base
Create a custom vectorstore for the Agent to use as a tool to answer questions with. We'll store the results in Pinecone, which is supported by LangChain (docs, API reference). For help getting started with Pinecone or other vector databases, we have a cookbook to help you get started.
You can check the LangChain documentation to see what other vectorstores and databases are available.
For this example we'll use the transcripts of the Stuff You Should Know podcast, made available thanks to OSF DOI 10.17605/OSF.IO/VM9NT.
import wget
# Here is the URL for a zip archive containing the transcribed podcasts
# Note that this data has already been split into chunks and embeddings from OpenAI's `text-embedding-3-small` embedding model are included
content_url = 'https://cdn.openai.com/API/examples/data/sysk_podcast_transcripts_embedded.json.zip'
# Download the file (it's ~541 MB so this will take some time)
wget.download(content_url)
100% [......................................................................] 571275039 / 571275039
'sysk_podcast_transcripts_embedded.json.zip'
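With the download complete, the archive can be extracted and loaded. The cell below is a minimal sketch that assumes the zip contains a single JSON file of the same name; inspect the record schema yourself before relying on any particular fields.
# Extract the archive and load the JSON (the inner file name is an assumption based on the zip name)
with zipfile.ZipFile('sysk_podcast_transcripts_embedded.json.zip', 'r') as zip_ref:
    zip_ref.extractall('./data')
with open('./data/sysk_podcast_transcripts_embedded.json') as f:
    processed_podcasts = json.load(f)
# Peek at the structure before assuming a schema
print(type(processed_podcasts))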