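Pydantic Tree Summarize
In this notebook, we demonstrate how to use tree summarize with structured outputs. Specifically, tree summarize is used to output pydantic objects.
If you are opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.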
In [ ]:
!pip install llama-index
In [ ]:
import os
import openai
In [ ]:
os.environ["OPENAI_API_KEY"] = "sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]
Download Data
In [ ]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
Load Data
In [ ]:
from llama_index.core import SimpleDirectoryReader
In [ ]:
reader = SimpleDirectoryReader(
    input_files=["./data/paul_graham/paul_graham_essay.txt"]
)
In [ ]:
docs = reader.load_data()
In [ ]:
text = docs[0].text
Define Custom Prompt
In [ ]:
from llama_index.core import PromptTemplate
In [ ]:
# NOTE: we add an extra tone_name variable here
qa_prompt_tmpl = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "Please also write the answer in the style of {tone_name}.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_prompt = PromptTemplate(qa_prompt_tmpl)
refine_prompt_tmpl = (
    "The original query is as follows: {query_str}\n"
    "We have provided an existing answer: {existing_answer}\n"
    "We have the opportunity to refine the existing answer "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{context_msg}\n"
    "------------\n"
    "Given the new context, refine the original answer to better "
    "answer the query. "
    "Please also write the answer in the style of {tone_name}.\n"
    "If the context isn't useful, return the original answer.\n"
    "Refined Answer: "
)
refine_prompt = PromptTemplate(refine_prompt_tmpl)
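To make the role of the extra `tone_name` variable concrete, the sketch below lists the placeholders in a template and fills them in. It uses only the standard library (`str.format` and `string.Formatter`) as a stand-in for `PromptTemplate`; the helper name `template_variables`, the shortened template, and the sample values are illustrative assumptions, not part of LlamaIndex.

```python
from string import Formatter
from typing import List

# Shortened copy of the QA template (a plain string, no LlamaIndex required).
qa_tmpl = (
    "Context information is below.\n"
    "{context_str}\n"
    "Please also write the answer in the style of {tone_name}.\n"
    "Query: {query_str}\n"
    "Answer: "
)


def template_variables(template: str) -> List[str]:
    """Return the placeholder names of a format string, in order of appearance."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text, so those entries are skipped.
    return [name for _, name, _, _ in Formatter().parse(template) if name]


variables = template_variables(qa_tmpl)
print(variables)

# Every placeholder, including the custom tone_name, must be supplied when formatting.
filled = qa_tmpl.format(
    context_str="Paul Graham co-founded Y Combinator.",
    tone_name="a haiku",
    query_str="who is Paul Graham?",
)
print(filled)
```

Because `tone_name` is just another placeholder, any call that renders the prompt has to provide it, which is why it appears as a keyword argument in the `get_response` calls later in this notebook.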
Try Response Synthesis with Custom Prompt
We try out a few different response synthesis strategies with the custom prompt.
In [ ]:
from llama_index.core.response_synthesizers import TreeSummarize, Refine
from llama_index.core.types import BaseModel
from typing import List
In [ ]:
summarizer = TreeSummarize(verbose=True, summary_template=qa_prompt)
In [ ]:
response = summarizer.get_response(
    "who is Paul Graham?", [text], tone_name="a Shakespeare play"
)
5 text chunks after repacking
1 text chunks after repacking
In [ ]:
print(str(response))
Paul Graham, a noble and esteemed gentleman, is a man of many talents and accomplishments. He hath traversed the realms of art, entrepreneurship, and writing, leaving a lasting impact on each. With his brush, he hath brought life to canvases, capturing the essence of what he saw. In the realm of technology, he hath revolutionized the way we do business, founding Viaweb and bringing the power of the web to entrepreneurs and artists alike. His wisdom and guidance hath shaped the future of technology and entrepreneurship through his co-founding of Y Combinator. But above all, Paul Graham is a visionary, a trailblazer, and a true Renaissance man, whose intellectual curiosity and quest for lasting creation hath inspired generations to come.
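The verbose log for this run reports a chunk count at two levels (5, then 1), which is consistent with the bottom-up shape of tree summarization: answer over each packed chunk, then answer again over the combined answers until a single one remains. The following is a minimal sketch of that idea in plain Python, not LlamaIndex's implementation; `repack`, `tree_summarize`, the character budget `max_chars`, and the `fake_llm` stub are all assumptions made for illustration.

```python
from typing import Callable, List


def repack(chunks: List[str], max_chars: int) -> List[str]:
    """Greedily merge neighbouring chunks so each packed chunk stays within max_chars."""
    packed: List[str] = []
    current = ""
    for chunk in chunks:
        candidate = f"{current}\n{chunk}" if current else chunk
        if current and len(candidate) > max_chars:
            # Adding this chunk would overflow the budget: close the current pack.
            packed.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        packed.append(current)
    return packed


def tree_summarize(
    query: str,
    chunks: List[str],
    llm: Callable[[str, str], str],
    max_chars: int = 40,
) -> str:
    """Answer over each packed chunk, then recurse on the answers until one is left.

    Assumes the LLM's answers are shorter than its inputs, so every level
    of the tree has fewer chunks than the one below it.
    """
    packed = repack(chunks, max_chars)
    print(f"{len(packed)} text chunks after repacking")
    if len(packed) == 1:
        return llm(query, packed[0])
    summaries = [llm(query, chunk) for chunk in packed]
    return tree_summarize(query, summaries, llm, max_chars)


calls: List[str] = []


def fake_llm(query: str, context: str) -> str:
    """Stub LLM: records its input and returns a short placeholder 'summary'."""
    calls.append(context)
    return f"S({len(context)})"


chunks = ["a" * 30, "b" * 30, "c" * 30, "d" * 30]
answer = tree_summarize("who is Paul Graham?", chunks, fake_llm)
print(answer, len(calls))
```

In this toy run the four 30-character chunks cannot be merged under the 40-character budget, so the stub is called once per chunk and once more on the merged summaries.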
In [ ]:
summarizer = Refine(
    verbose=True, text_qa_template=qa_prompt, refine_template=refine_prompt
)
In [ ]:
response = summarizer.get_response(
    "who is Paul Graham?", [text], tone_name="a haiku"
)
> Refine context: made a living from a combination of modelling a...
> Refine context: to have studied art, because the main goal of a...
> Refine context: I had been intimately involved with building th...
> Refine context: I didn't understand what he meant, but graduall...
In [ ]:
print(str(response))
Paul Graham, a web pioneer,
Co-founded Y Combinator,
But stepped down to ensure,
Long-term success and more.
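Refine, by contrast, walks the chunks sequentially: the first chunk is answered with the QA prompt, and each later chunk is offered as new context for improving the running answer, which is consistent with the "Refine context" lines logged for this run. A minimal sketch of that loop follows, again with a stub in place of the model. The function name `refine`, the shortened templates, and `fake_llm` are assumptions for illustration, not the library's code.

```python
from typing import Callable, List

# Shortened stand-ins for the QA and refine templates defined earlier.
QA_TMPL = "Context: {context_str}\nStyle: {tone_name}\nQuery: {query_str}\nAnswer: "
REFINE_TMPL = (
    "Query: {query_str}\n"
    "Existing answer: {existing_answer}\n"
    "New context: {context_msg}\n"
    "Style: {tone_name}\n"
    "Refined answer: "
)


def refine(
    query: str, chunks: List[str], llm: Callable[[str], str], **prompt_args: str
) -> str:
    """Answer from the first chunk, then refine that answer once per remaining chunk."""
    if not chunks:
        raise ValueError("at least one text chunk is required")
    answer = llm(
        QA_TMPL.format(context_str=chunks[0], query_str=query, **prompt_args)
    )
    for chunk in chunks[1:]:
        # Each later chunk sees the running answer plus one new piece of context.
        answer = llm(
            REFINE_TMPL.format(
                query_str=query,
                existing_answer=answer,
                context_msg=chunk,
                **prompt_args,
            )
        )
    return answer


prompts_seen: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub LLM: records the prompt and returns a numbered placeholder answer."""
    prompts_seen.append(prompt)
    return f"answer v{len(prompts_seen)}"


final = refine(
    "who is Paul Graham?",
    ["chunk A", "chunk B", "chunk C"],
    fake_llm,
    tone_name="a haiku",
)
print(final)
```

Extra keyword arguments such as `tone_name` are forwarded to both templates, mirroring how the custom variable must be available to the QA prompt and the refine prompt alike.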
In [ ]:
# try with pydantic model
class Biography(BaseModel):
    """Data model for a biography."""
    name: str
    best_known_for: List[str]
    extra_info: str
In [ ]:
summarizer = TreeSummarize(
    verbose=True, summary_template=qa_prompt, output_cls=Biography
)
In [ ]:
response = summarizer.get_response(
    "who is Paul Graham?", [text], tone_name="a business memo"
)
5 text chunks after repacking
1 text chunks after repacking
In [ ]:
print(str(response))
name='Paul Graham' best_known_for=['Co-founder of Y Combinator', 'Writer', 'Investor'] extra_info="Paul Graham is a renowned entrepreneur, writer, and investor. He is best known as the co-founder of Y Combinator, a highly successful startup accelerator. Graham has played a significant role in shaping the startup ecosystem and has been instrumental in the success of numerous startups. He is also a prolific writer, known for his insightful essays on a wide range of topics, including technology, startups, and entrepreneurship. Graham's writings have been widely read and have had a profound impact on the tech community. In addition to his work with Y Combinator and his writing, Graham is also an active investor, providing seed funding and mentorship to early-stage startups. His contributions to the startup world have earned him a reputation as one of the most influential figures in the industry."
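The `field=value` form of this output suggests the response is a `Biography` object rather than free text, so its fields can be read as attributes instead of being parsed out of a string. The sketch below illustrates the underlying idea of structured output: a JSON payload is turned into a typed object, and payloads with missing fields are rejected. It uses a standard-library dataclass as a stand-in for the pydantic model; `BiographyRecord`, `parse_biography`, and the sample payload (abridged from the output above) are assumptions for illustration, and pydantic performs this kind of validation itself.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class BiographyRecord:
    """Stdlib stand-in mirroring the fields of the Biography model."""

    name: str
    best_known_for: List[str]
    extra_info: str


def parse_biography(payload: str) -> BiographyRecord:
    """Parse a JSON string into a BiographyRecord, rejecting malformed payloads."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    expected = {f.name for f in fields(BiographyRecord)}
    missing = expected - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(data["best_known_for"], list):
        raise ValueError("best_known_for must be a list")
    return BiographyRecord(**{key: data[key] for key in expected})


# Sample payload abridged from the printed response.
sample = json.dumps(
    {
        "name": "Paul Graham",
        "best_known_for": ["Co-founder of Y Combinator", "Writer", "Investor"],
        "extra_info": "Paul Graham is a renowned entrepreneur, writer, and investor.",
    }
)

bio = parse_biography(sample)
print(bio.name)
print(bio.best_known_for)
```

Typed fields are what make the result usable downstream: code can iterate over `best_known_for` directly rather than searching the answer text for a list.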