Load Data¶
We load Paul Graham's essay as an example.
In [ ]:
%pip install llama-index-llms-openai
In [ ]:
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O pg_essay.txt
--2024-01-10 12:31:00--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘pg_essay.txt’

pg_essay.txt        100%[===================>]  73.28K  --.-KB/s    in 0.01s

2024-01-10 12:31:00 (6.32 MB/s) - ‘pg_essay.txt’ saved [75042/75042]
In [ ]:
from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(input_files=["pg_essay.txt"])
documents = reader.load_data()
A query pipeline chains modules (prompts, LLMs, query engines) together, and a router lets the pipeline dispatch each query to the branch best suited to answer it. In this tutorial, we build a query pipeline that routes between a vector index (for specific questions) and a summary index (for summarization questions).
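Before wiring up the real pipeline, the routing idea can be sketched in plain Python: a selector inspects the query and picks which chain handles it. The keyword heuristic and the component functions below are hypothetical stand-ins for the LLM-based selector and the vector/summary chains used later.

```python
from typing import Callable, List

def keyword_selector(query: str, choices: List[str]) -> int:
    """Stand-in for an LLM-based selector: route 'summary'-style
    queries to component 1, everything else to component 0."""
    return 1 if "summary" in query.lower() else 0

def route(
    query: str,
    choices: List[str],
    components: List[Callable[[str], str]],
) -> str:
    """Pick a component via the selector, then run only that component."""
    idx = keyword_selector(query, choices)
    return components[idx](query)

choices = [
    "Answers specific questions about the document",
    "Answers summarization questions about the document",
]
# Hypothetical stand-ins for the vector chain and summary chain.
components = [
    lambda q: f"[vector chain] {q}",
    lambda q: f"[summary chain] {q}",
]

print(route("What did the author do during his time in YC?", choices, components))
# → [vector chain] What did the author do during his time in YC?
print(route("What is a summary of this document?", choices, components))
# → [summary chain] What is a summary of this document?
```

The LLM-based `LLMSingleSelector` below plays the role of `keyword_selector`: it reads the `choices` descriptions and picks exactly one component per query.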
Define Modules¶
We define the LLM, vector index, summary index, and prompt templates.
In [ ]:
from llama_index.core.query_pipeline import QueryPipeline, InputComponent
from typing import Dict, Any, List, Optional
from llama_index.llms.openai import OpenAI
from llama_index.core import Document, VectorStoreIndex
from llama_index.core import SummaryIndex
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.schema import NodeWithScore, TextNode
from llama_index.core import PromptTemplate
from llama_index.core.selectors import LLMSingleSelector

# define HyDE template
hyde_str = """\
Please write a passage to answer the question: {query_str}

Try to include as many key details as possible.

Passage: """
hyde_prompt = PromptTemplate(hyde_str)

# define LLM
llm = OpenAI(model="gpt-3.5-turbo")

# define synthesizer
summarizer = TreeSummarize(llm=llm)

# define vector retriever
vector_index = VectorStoreIndex.from_documents(documents)
vector_query_engine = vector_index.as_query_engine(similarity_top_k=2)

# define summary query prompt + retriever
summary_index = SummaryIndex.from_documents(documents)
summary_qrewrite_str = """\
Here's a question:
{query_str}

You are responsible for feeding the question to an agent that given context will try to answer the question.
The context may or may not be relevant. Rewrite the question to highlight the fact that
only some pieces of context (or none) may be relevant.
"""
summary_qrewrite_prompt = PromptTemplate(summary_qrewrite_str)
summary_query_engine = summary_index.as_query_engine()

# define selector
selector = LLMSingleSelector.from_defaults()
Construct Query Pipeline¶
Define a query pipeline for the vector index and the summary index, and connect them with a router.
In [ ]:
# define summary query pipeline
from llama_index.core.query_pipeline import RouterComponent

vector_chain = QueryPipeline(chain=[vector_query_engine])
summary_chain = QueryPipeline(
    chain=[summary_qrewrite_prompt, llm, summary_query_engine], verbose=True
)

choices = [
    "This tool answers specific questions about the document (not summary questions across the document)",
    "This tool answers summary questions about the document (not specific questions)",
]

router_c = RouterComponent(
    selector=selector,
    choices=choices,
    components=[vector_chain, summary_chain],
    verbose=True,
)
# top-level pipeline
qp = QueryPipeline(chain=[router_c], verbose=True)
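The `chain=[...]` argument composes modules sequentially: each module's output becomes the next module's input, which is how the summary chain feeds the rewrite prompt into the LLM and then into the query engine. A minimal sketch of that composition, with hypothetical stand-in functions for the prompt, LLM, and query engine:

```python
from typing import Callable, List

def run_chain(modules: List[Callable[[str], str]], query: str) -> str:
    """Run modules sequentially: each output is the next module's input."""
    out = query
    for module in modules:
        out = module(out)
    return out

# Hypothetical stand-ins for summary_qrewrite_prompt, llm, summary_query_engine.
rewrite_prompt = lambda q: f"Rewrite this question for a summary agent: {q}"
fake_llm = lambda prompt: prompt.replace(
    "Rewrite this question for a summary agent: ", ""
)
fake_query_engine = lambda q: f"Answer to: {q}"

result = run_chain(
    [rewrite_prompt, fake_llm, fake_query_engine], "What is this doc about?"
)
print(result)  # → Answer to: What is this doc about?
```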
Let's try out a few queries to see the router in action.
In [ ]:
# compare with the synchronous method
response = qp.run("What did the author do during his time in YC?")
print(str(response))
> Running module c0a87442-3165-443d-9709-960e6ddafe7f with input:
query: What did the author do during his time in YC?

Selecting component 0: The author used a tool to answer specific questions about the document, which suggests that he was engaged in analyzing and extracting specific information from the document during his time in YC..

During his time in YC, the author worked on various tasks related to running Y Combinator. This included selecting and helping founders, dealing with disputes between cofounders, figuring out when people were lying, and fighting with people who maltreated the startups. The author also worked on writing essays and internal software for YC.
In [ ]:
response = qp.run("What is a summary of this document?")
print(str(response))
> Running module c0a87442-3165-443d-9709-960e6ddafe7f with input:
query: What is a summary of this document?

Selecting component 1: The summary questions about the document are answered by this tool..

> Running module 0e7e9d49-4c92-45a9-b3bf-0e6ab76b51f9 with input:
query_str: What is a summary of this document?

> Running module b0ece4e3-e6cd-4229-8663-b0cd0638683c with input:
messages: Here's a question: What is a summary of this document? You are responsible for feeding the question to an agent that given context will try to answer the question. The context may or may not be relev...

> Running module f247ae78-a71c-4347-ba49-d9357ee93636 with input:
input: assistant: What is the summary of the document?

The document discusses the development and evolution of Lisp as a programming language. It highlights how Lisp was originally created as a formal model of computation and later transformed into a programming language with the assistance of Steve Russell. The document also emphasizes the unique power and elegance of Lisp in comparison to other languages.