Summarize Text
This tutorial demonstrates text summarization using built-in chains and LangGraph.
A previous version of this page showcased the legacy chains StuffDocumentsChain, MapReduceDocumentsChain, and RefineDocumentsChain. See here for information on using those abstractions and a comparison with the methods demonstrated in this tutorial.
Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content.
LLMs are a great tool for this given their proficiency in understanding and synthesizing text.
In the context of retrieval-augmented generation, summarizing text can help distill the information in a large number of retrieved documents to provide context for an LLM.
In this walkthrough we'll go over how to summarize content from multiple documents using LLMs.
Concepts
Concepts we will cover are:
- Using language models.
- Using document loaders, specifically the WebBaseLoader to load content from an HTML webpage.
- Two ways to summarize or otherwise combine documents.
  - Stuff, which simply concatenates documents into a prompt;
  - Map-reduce, for larger sets of documents. This splits documents into batches, summarizes those, and then summarizes the summaries.
Shorter, targeted guides on these strategies and others, including iterative refinement, can be found in the how-to guides.
Setup
Jupyter Notebook
This guide (and most of the other guides in the documentation) uses Jupyter notebooks and assumes the reader does as well. Jupyter notebooks are great for learning how to work with LLM systems because things can often go wrong (unexpected output, the API is down, etc.), and walking through guides in an interactive environment is a great way to better understand them.
This and other tutorials are perhaps most conveniently run in a Jupyter notebook. See here for instructions on how to install it.
Installation
To install LangChain run:
- Pip
- Conda
pip install langchain
conda install langchain -c conda-forge
For more details, see our Installation guide.
LangSmith
Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.
After you sign up at the link above, make sure to set your environment variables to start logging traces:
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."
Or, if in a notebook, you can set them with:
import getpass
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
Overview
A central question for building a summarizer is how to pass your documents into the LLM's context window. Two common approaches for this are:
- Stuff: Simply "stuff" all your documents into a single prompt. This is the simplest approach (see here for more on the create_stuff_documents_chain constructor, which is used for this method).
- Map-reduce: Summarize each document on its own in a "map" step and then "reduce" the summaries into a final summary (see here for more on the MapReduceDocumentsChain, which is used for this method).
Note that map-reduce is especially effective when understanding of a sub-document does not rely on preceding context — for instance, when summarizing a corpus of many shorter documents. In other cases, such as summarizing a novel or other body of text with an inherent sequence, iterative refinement may be more effective.
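For a sense of how iterative refinement differs, here is a minimal sketch (our addition, not part of the original tutorial) that folds each document into a running summary in order, assuming an `llm` and a list of `docs` as created later in this guide:
# Minimal sketch of iterative refinement: each document updates a running
# summary in sequence, preserving the text's inherent order. Assumes `llm`
# and `docs` are defined as in the sections below.
summary = ""
for doc in docs:
    refine_prompt = (
        f"Here is the summary so far:\n{summary}\n\n"
        f"Refine it using this additional context:\n{doc.page_content}"
    )
    summary = llm.invoke(refine_prompt).content
print(summary)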
Setup
First set environment variables and install packages:
%pip install --upgrade --quiet tiktoken langchain langgraph beautifulsoup4 langchain-community
# Set env var OPENAI_API_KEY or load from a .env file
# import dotenv
# dotenv.load_dotenv()
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
First we load in our documents. We will use WebBaseLoader to load a blog post:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()
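As an optional sanity check (our addition, not a tutorial step), you can inspect what was loaded before summarizing:
print(len(docs))  # number of Documents loaded (here, 1 for the blog post)
print(docs[0].page_content[:200])  # preview the first 200 characters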
Next, let's select an LLM:
pip install -qU langchain-openai
import getpass
import os
if not os.environ.get("OPENAI_API_KEY"):
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
Stuff: Summarize in a single LLM call
We can use create_stuff_documents_chain, especially if using larger context window models such as:
- 128k token OpenAI gpt-4o
- 200k token Anthropic claude-3-5-sonnet-20240620
The chain will take a list of documents, insert them all into a single prompt, and pass that prompt to an LLM:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
# Define prompt
prompt = ChatPromptTemplate.from_messages(
[("system", "Write a concise summary of the following:\\n\\n{context}")]
)
# Instantiate chain
chain = create_stuff_documents_chain(llm, prompt)
# Invoke chain
result = chain.invoke({"context": docs})
print(result)
The article "LLM Powered Autonomous Agents" by Lilian Weng discusses the development and capabilities of autonomous agents powered by large language models (LLMs). It outlines a system architecture that includes three main components: Planning, Memory, and Tool Use.
1. **Planning** involves task decomposition, where complex tasks are broken down into manageable subgoals, and self-reflection, allowing agents to learn from past actions to improve future performance. Techniques like Chain of Thought (CoT) and Tree of Thoughts (ToT) are highlighted for enhancing reasoning and planning.
2. **Memory** is categorized into short-term and long-term memory, with mechanisms for fast retrieval using Maximum Inner Product Search (MIPS) algorithms. This allows agents to retain and recall information effectively.
3. **Tool Use** enables agents to interact with external APIs and tools, enhancing their capabilities beyond the limitations of their training data. Examples include MRKL systems and frameworks like HuggingGPT, which facilitate task planning and execution.
The article also addresses challenges such as finite context length, difficulties in long-term planning, and the reliability of natural language interfaces. It concludes with case studies demonstrating the practical applications of these concepts in scientific discovery and interactive simulations. Overall, the article emphasizes the potential of LLMs as powerful problem solvers in autonomous agent systems.
Streaming
Note that we can also stream the result token-by-token:
for token in chain.stream({"context": docs}):
print(token, end="|")
|The| article| "|LL|M| Powered| Autonomous| Agents|"| by| Lil|ian| W|eng| discusses| the| development| and| capabilities| of| autonomous| agents| powered| by| large| language| models| (|LL|Ms|).| It| outlines| a| system| architecture| that| includes| three| main| components|:| Planning|,| Memory|,| and| Tool| Use|.|
|1|.| **|Planning|**| involves| task| decomposition|,| where| complex| tasks| are| broken| down| into| manageable| sub|go|als|,| and| self|-ref|lection|,| allowing| agents| to| learn| from| past| actions| to| improve| future| performance|.| Techniques| like| Chain| of| Thought| (|Co|T|)| and| Tree| of| Thoughts| (|To|T|)| are| highlighted| for| enhancing| reasoning| and| planning|.
|2|.| **|Memory|**| is| categorized| into| short|-term| and| long|-term| memory|,| with| mechanisms| for| fast| retrieval| using| Maximum| Inner| Product| Search| (|M|IPS|)| algorithms|.| This| allows| agents| to| retain| and| recall| information| effectively|.
|3|.| **|Tool| Use|**| emphasizes| the| integration| of| external| APIs| and| tools| to| extend| the| capabilities| of| L|LM|s|,| enabling| them| to| perform| tasks| beyond| their| inherent| limitations|.| Examples| include| MR|KL| systems| and| frameworks| like| Hug|ging|GPT|,| which| facilitate| task| planning| and| execution|.
|The| article| also| addresses| challenges| such| as| finite| context| length|,| difficulties| in| long|-term| planning|,| and| the| reliability| of| natural| language| interfaces|.| It| concludes| with| case| studies| demonstrating| the| practical| applications| of| L|LM|-powered| agents| in| scientific| discovery| and| interactive| simulations|.| Overall|,| the| piece| illustrates| the| potential| of| L|LM|s| as| general| problem| sol|vers| and| their| evolving| role| in| autonomous| systems|.||
Go deeper
- You can easily customize the prompt.
- You can easily try different LLMs (e.g., Claude) via the llm parameter, as in the sketch below.
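For example, here is a hedged sketch of swapping in Anthropic's Claude, assuming the langchain-anthropic package is installed and ANTHROPIC_API_KEY is set:
# Hypothetical swap: any chat model can be passed in place of `llm`.
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-5-sonnet-20240620")
chain = create_stuff_documents_chain(llm, prompt)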
Map-Reduce: Summarize long texts via parallelization
Let's unpack the map-reduce approach. For this, we'll first map each document to an individual summary using an LLM. Then we'll reduce, or consolidate, those summaries into a single global summary.
Note that the map step is typically parallelized over the input documents.
LangGraph, which is built on top of langchain-core, supports map-reduce workflows and is well-suited to this problem:
- LangGraph allows for individual steps (such as successive summarizations) to be streamed, allowing for greater control of execution;
- LangGraph's checkpointing supports error recovery and human-in-the-loop workflows, and makes it easier to incorporate the application into a conversational context (see the sketch after this list);
- The LangGraph implementation is straightforward to modify and extend, as we will see below.
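As a brief illustration of the checkpointing point, here is a minimal sketch (our assumption, not part of this tutorial) of compiling a graph with an in-memory checkpointer; the `graph` object is constructed later in this section:
# Sketch: compiling with a checkpointer persists state across invocations.
from langgraph.checkpoint.memory import MemorySaver

checkpointed_app = graph.compile(checkpointer=MemorySaver())
# Invocations then carry a thread id so execution can be resumed:
# checkpointed_app.invoke(inputs, {"configurable": {"thread_id": "1"}})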
Map
First, let's define the prompt associated with the map step. We can use the same summarization prompt as in the stuff approach above:
from langchain_core.prompts import ChatPromptTemplate
map_prompt = ChatPromptTemplate.from_messages(
[("system", "Write a concise summary of the following:\\n\\n{context}")]
)
We can also use the Prompt Hub to store and fetch prompts.
This will work with your LangSmith API key.
For example, see the map prompt here.
from langchain import hub
map_prompt = hub.pull("rlm/map-prompt")
Reduce
We also define a prompt that takes the document mapping results and reduces them into a single output.
# Also available via the hub: `hub.pull("rlm/reduce-prompt")`
reduce_template = """
The following is a set of summaries:
{docs}
Take these and distill it into a final, consolidated summary
of the main themes.
"""
reduce_prompt = ChatPromptTemplate([("human", reduce_template)])
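To see exactly what the reduce step will send to the model, you can optionally (our addition) render the prompt with placeholder summaries:
# Optional: render the reduce prompt with dummy input to inspect it.
print(reduce_prompt.invoke({"docs": "Summary 1\nSummary 2"}).to_string())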
Orchestration via LangGraph
Below we implement a simple application that maps the summarization step on a list of documents, then reduces them using the above prompts.
Map-reduce flows are particularly useful when texts are long compared to the context window of an LLM. For long texts, we need a mechanism that ensures that the context to be summarized in the reduce step does not exceed a model's context window size. Here we implement a recursive "collapsing" of the summaries: the inputs are partitioned based on a token limit, and summaries are generated of the partitions. This step is repeated until the total length of the summaries is within the desired limit, allowing for summarization of arbitrary-length text.
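For a rough illustration (numbers assumed, not measured): if 14 chunk summaries total about 3,500 tokens against a 1,000-token limit, a first collapse pass partitions them into groups of at most 1,000 tokens and summarizes each group; if those group summaries still exceed the limit, the pass repeats before the final reduce.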
First we chunk the blog post into smaller "sub documents" to be mapped:
from langchain_text_splitters import CharacterTextSplitter
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
chunk_size=1000, chunk_overlap=0
)
split_docs = text_splitter.split_documents(docs)
print(f"Generated {len(split_docs)} documents.")
Created a chunk of size 1003, which is longer than the specified 1000
Generated 14 documents.
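Before defining the graph, it can be useful to confirm (an optional check we've added) that each chunk fits comfortably in the map step, using the model's tokenizer:
# Optional: the largest chunk, in tokens, as counted by the LLM's tokenizer.
print(max(llm.get_num_tokens(d.page_content) for d in split_docs))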
Next, we define our graph. Note that we define an artificially low maximum token length of 1,000 tokens to illustrate the "collapsing" step.
import operator
from typing import Annotated, List, Literal, TypedDict
from langchain.chains.combine_documents.reduce import (
acollapse_docs,
split_list_of_docs,
)
from langchain_core.documents import Document
from langgraph.constants import Send
from langgraph.graph import END, START, StateGraph
token_max = 1000
def length_function(documents: List[Document]) -> int:
"""Get number of tokens for input contents."""
return sum(llm.get_num_tokens(doc.page_content) for doc in documents)
# This will be the overall state of the main graph.
# It will contain the input document contents, corresponding
# summaries, and a final summary.
class OverallState(TypedDict):
# Notice here we use the operator.add
# This is because we want to combine all the summaries we generate
# from individual nodes back into one list - this is essentially
# the "reduce" part
contents: List[str]
summaries: Annotated[list, operator.add]
collapsed_summaries: List[Document]
final_summary: str
# This will be the state of the node that we will "map" all
# documents to in order to generate summaries
class SummaryState(TypedDict):
content: str
# Here we generate a summary, given a document
async def generate_summary(state: SummaryState):
prompt = map_prompt.invoke(state["content"])
response = await llm.ainvoke(prompt)
return {"summaries": [response.content]}
# Here we define the logic to map out over the documents
# We will use this as an edge in the graph
def map_summaries(state: OverallState):
# We will return a list of `Send` objects
# Each `Send` object consists of the name of a node in the graph
# as well as the state to send to that node
return [
Send("generate_summary", {"content": content}) for content in state["contents"]
]
def collect_summaries(state: OverallState):
return {
"collapsed_summaries": [Document(summary) for summary in state["summaries"]]
}
async def _reduce(input: dict) -> str:
prompt = reduce_prompt.invoke(input)
response = await llm.ainvoke(prompt)
return response.content
# Add node to collapse summaries
async def collapse_summaries(state: OverallState):
doc_lists = split_list_of_docs(
state["collapsed_summaries"], length_function, token_max
)
results = []
for doc_list in doc_lists:
results.append(await acollapse_docs(doc_list, _reduce))
return {"collapsed_summaries": results}
# This represents a conditional edge in the graph that determines
# if we should collapse the summaries or not
def should_collapse(
state: OverallState,
) -> Literal["collapse_summaries", "generate_final_summary"]:
num_tokens = length_function(state["collapsed_summaries"])
if num_tokens > token_max:
return "collapse_summaries"
else:
return "generate_final_summary"
# Here we will generate the final summary
async def generate_final_summary(state: OverallState):
response = await _reduce(state["collapsed_summaries"])
return {"final_summary": response}
# Construct the graph
# Nodes:
graph = StateGraph(OverallState)
graph.add_node("generate_summary", generate_summary) # same as before
graph.add_node("collect_summaries", collect_summaries)
graph.add_node("collapse_summaries", collapse_summaries)
graph.add_node("generate_final_summary", generate_final_summary)
# Edges:
graph.add_conditional_edges(START, map_summaries, ["generate_summary"])
graph.add_edge("generate_summary", "collect_summaries")
graph.add_conditional_edges("collect_summaries", should_collapse)
graph.add_conditional_edges("collapse_summaries", should_collapse)
graph.add_edge("generate_final_summary", END)
app = graph.compile()
LangGraph allows the graph structure to be plotted to help visualize its function:
from IPython.display import Image
Image(app.get_graph().draw_mermaid_png())
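If PNG rendering is unavailable in your environment, the underlying Mermaid source can be printed instead (a fallback we suggest, not part of the original tutorial):
# Fallback: print the Mermaid definition of the graph.
print(app.get_graph().draw_mermaid())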
When running the application, we can stream the graph to observe its sequence of steps. Below, we will simply print out the name of each step.
Note that because we have a loop in the graph, it can be helpful to specify a recursion_limit on its execution. This will raise a specific error when the specified limit is exceeded.
async for step in app.astream(
{"contents": [doc.page_content for doc in split_docs]},
{"recursion_limit": 10},
):
print(list(step.keys()))
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['collect_summaries']
['collapse_summaries']
['collapse_summaries']
['generate_final_summary']
print(step)
{'generate_final_summary': {'final_summary': 'The consolidated summary of the main themes from the provided documents is as follows:\n\n1. **Integration of Large Language Models (LLMs) in Autonomous Agents**: The documents explore the evolving role of LLMs in autonomous systems, emphasizing their enhanced reasoning and acting capabilities through methodologies that incorporate structured planning, memory systems, and tool use.\n\n2. **Core Components of Autonomous Agents**:\n - **Planning**: Techniques like task decomposition (e.g., Chain of Thought) and external classical planners are utilized to facilitate long-term planning by breaking down complex tasks.\n - **Memory**: The memory system is divided into short-term (in-context learning) and long-term memory, with parallels drawn between human memory and machine learning to improve agent performance.\n - **Tool Use**: Agents utilize external APIs and algorithms to enhance problem-solving abilities, exemplified by frameworks like HuggingGPT that manage task workflows.\n\n3. **Neuro-Symbolic Architectures**: The integration of MRKL (Modular Reasoning, Knowledge, and Language) systems combines neural and symbolic expert modules with LLMs, addressing challenges in tasks such as verbal math problem-solving.\n\n4. **Specialized Applications**: Case studies, such as ChemCrow and projects in anticancer drug discovery, demonstrate the advantages of LLMs augmented with expert tools in specialized domains.\n\n5. **Challenges and Limitations**: The documents highlight challenges such as hallucination in model outputs and the finite context length of LLMs, which affects their ability to incorporate historical information and perform self-reflection. Techniques like Chain of Hindsight and Algorithm Distillation are discussed to enhance model performance through iterative learning.\n\n6. **Structured Software Development**: A systematic approach to creating Python software projects is emphasized, focusing on defining core components, managing dependencies, and adhering to best practices for documentation.\n\nOverall, the integration of structured planning, memory systems, and advanced tool use aims to enhance the capabilities of LLM-powered autonomous agents while addressing the challenges and limitations these technologies face in real-world applications.'}}
In the corresponding LangSmith trace we can see the individual LLM calls, grouped under their respective nodes.
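As a hedged sketch of handling the recursion limit mentioned above: LangGraph raises GraphRecursionError when the limit is exceeded, which can be caught like any exception.
# Sketch: catch the error raised when recursion_limit is exceeded.
from langgraph.errors import GraphRecursionError

try:
    await app.ainvoke(
        {"contents": [doc.page_content for doc in split_docs]},
        {"recursion_limit": 10},
    )
except GraphRecursionError:
    print("Recursion limit exceeded; consider raising recursion_limit.")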
Go deeper
Customization
- As shown above, you can customize the LLMs and prompts for the map and reduce stages.
Real-world use case
- See this blog post case study on analyzing user interactions (questions about LangChain documentation)!
- The blog post and associated repo also introduce clustering as a means of summarization.
- This opens up another path beyond the stuff or map-reduce approaches that is worth considering.
Next steps
We encourage you to check out the how-to guides for more detail on these and other concepts.