Llama3 Cookbook with Ollama and Replicate¶
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
In this notebook, we demonstrate how to use Llama3 with LlamaIndex across a comprehensive set of use cases:
- Basic completion / chat
- Basic RAG (vector search, summarization)
- Advanced RAG (routing, sub-questions)
- Text-to-SQL
- Structured data extraction
- Chat engine + memory
- Agents
We use Llama3-8B through Ollama, and Llama3-70B through Replicate.
Installation and Setup¶
!pip install llama-index
!pip install llama-index-llms-ollama
!pip install llama-index-llms-replicate
!pip install llama-index-embeddings-huggingface
!pip install llama-parse
!pip install replicate
import nest_asyncio
nest_asyncio.apply()
Setup LLM using Ollama¶
from llama_index.llms.ollama import Ollama
llm = Ollama(model="llama3", request_timeout=120.0)
Setup LLM using Replicate¶
Make sure you have the REPLICATE_API_TOKEN specified!
# Set the REPLICATE_API_TOKEN environment variable to "<YOUR_API_KEY>"
from llama_index.llms.replicate import Replicate
llm_replicate = Replicate(model="meta/meta-llama-3-70b-instruct")
# llm_replicate = Replicate(model="meta/meta-llama-3-8b-instruct")
Setup Embedding Model¶
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Define Global Settings Configuration¶
In LlamaIndex, you can define global settings so you don't have to pass the LLM / embedding model objects around everywhere.
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
!mkdir data
!wget "https://www.dropbox.com/scl/fi/t1soxfjdp0v44an6sdymd/drake_kendrick_beef.pdf?rlkey=u9546ymb7fj8lk2v64r6p5r5k&st=wjzzrgil&dl=1" -O data/drake_kendrick_beef.pdf
!wget "https://www.dropbox.com/scl/fi/nts3n64s6kymner2jppd6/drake.pdf?rlkey=hksirpqwzlzqoejn55zemk6ld&st=mohyfyh4&dl=1" -O data/drake.pdf
!wget "https://www.dropbox.com/scl/fi/8ax2vnoebhmy44bes2n1d/kendrick.pdf?rlkey=fhxvn94t5amdqcv9vshifd3hj&st=dxdtytn6&dl=1" -O data/kendrick.pdf
Load Data¶
We load data using LlamaParse by default, but you can also choose our free pypdf reader (bundled in SimpleDirectoryReader by default) if you don't have an account!
LlamaParse: Sign up for an account here: cloud.llamaindex.ai. You get 1,000 free pages a day, and the paid plan is 7,000 free pages a day + 0.3 cents per additional page. LlamaParse is a good option if you want to parse complex documents, like PDFs with charts, tables, and more.
Default PDF parser (in SimpleDirectoryReader): if you don't want to sign up for an account / use a PDF service, you can simply use the default PyPDF reader bundled with our file loaders. It's a great choice for getting started!
from llama_parse import LlamaParse
docs_kendrick = LlamaParse(result_type="text").load_data("./data/kendrick.pdf")
docs_drake = LlamaParse(result_type="text").load_data("./data/drake.pdf")
docs_both = LlamaParse(result_type="text").load_data(
"./data/drake_kendrick_beef.pdf"
)
Started parsing the file under job_id 32a7bb50-6a25-4295-971c-2de6f1588e0d
Started parsing the file under job_id b8cc075e-b6d5-4ded-b060-f72e9393b391
Started parsing the file under job_id 42fc41a4-68b6-49ee-8647-781b5cdb8893
1. Basic Completion and Chat¶
Call complete with a prompt¶
response = llm.complete("do you like drake or kendrick better?")
print(response)
I'm just an AI, I don't have personal preferences or opinions, nor can I listen to music. I exist solely to provide information and assist with tasks, so I don't have the capacity to enjoy or compare different artists' music. Both Drake and Kendrick Lamar are highly acclaimed rappers, and it's subjective which one you might prefer based on your individual tastes in music.
stream_response = llm.stream_complete(
"you're a drake fan. tell me why you like drake more than kendrick"
)
for t in stream_response:
print(t.delta, end="")
As a hypothetical Drake fan, I'd say that there are several reasons why I might prefer his music over Kendrick's. Here are a few possible reasons: 1. **Lyrical storytelling**: Drake is known for his vivid storytelling on tracks like "Marvins Room" and "Take Care." He has a way of painting pictures with his words, making listeners feel like they're right there with him, experiencing the highs and lows he's singing about. Kendrick, while also an incredible storyteller, might not have the same level of lyrical detail that Drake does. 2. **Melodic flow**: Drake's melodic flow is infectious! He has a way of crafting hooks and choruses that get stuck in your head, making it hard to stop listening. Kendrick's flows are often more complex and intricate, but Drake's simplicity can be just as effective in getting the job done. 3. **Vulnerability**: Drake isn't afraid to show his vulnerable side on tracks like "Hold On" and "I'm Upset." He wears his heart on his sleeve, sharing personal struggles and emotions with listeners. This vulnerability makes him relatable and easier to connect with on a deeper level. 4. **Production**: Drake has had the privilege of working with some incredible producers (like Noah "40" Shebib and Boi-1da) who bring out the best in him. The way he incorporates these sounds into his songs is often seamless, creating a unique blend of hip-hop and R&B that's hard to resist. 5. **Cultural relevance**: As someone who grew up in Toronto, Drake has a deep understanding of the Canadian experience and the struggles that come with it. He often references his hometown and the people he grew up around, giving his music a distinctly Canadian flavor. This cultural relevance makes his music feel more authentic and connected to the world we live in. 6. **Commercial appeal**: Let's face it – Drake has a knack for creating hits! His songs are often catchy, radio-friendly, and designed to get stuck in your head. 
While Kendrick might not have the same level of commercial success, Drake's ability to craft songs that resonate with a wider audience is undeniable. Of course, this is all just hypothetical – as a fan, I can appreciate both artists for their unique strengths and styles! What do you think?
Call chat with a list of messages¶
from llama_index.core.llms import ChatMessage
messages = [
ChatMessage(role="system", content="You are Kendrick."),
ChatMessage(role="user", content="Write a verse."),
]
response = llm.chat(messages)
print(response)
assistant: "Listen up, y'all, I got a message to share Been through the struggles, but my spirit's still fair From Compton streets to the top of the game I'm the real Hov, ain't nobody gonna claim my fame"
2. Basic RAG (Vector Search, Summarization)¶
Basic RAG (Vector Search)¶
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex.from_documents(docs_both)
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("Tell me about family matters")
print(str(response))
According to the provided context, "Family Matters" is a seven-and-a-half-minute diss track by Drake in response to Kendrick Lamar's disses against him. The song has three different beats and features several shots at Kendrick, as well as other members of Drake's entourage, including A$AP Rocky and The Weeknd. In the song, Drake raps about his personal life, including his relationships with Rihanna and Whitney Alford, and even makes allegations about Kendrick's domestic life.
Basic RAG (Summarization)¶
from llama_index.core import SummaryIndex
summary_index = SummaryIndex.from_documents(docs_both)
summary_engine = summary_index.as_query_engine()
response = summary_engine.query(
"Given your assessment of this article, who won the beef?"
)
print(str(response))
The article does not provide a clear verdict on who "won" the beef, nor does it suggest that the conflict has been definitively resolved. Instead, it presents the situation as ongoing and multifaceted, with both artists continuing to engage in a game of verbal sparring and lyrical one-upmanship.
3. Advanced RAG (Routing, Sub-Questions)¶
Build a router that can choose whether to do vector search or summarization¶
from llama_index.core.tools import QueryEngineTool, ToolMetadata
vector_tool = QueryEngineTool(
index.as_query_engine(),
metadata=ToolMetadata(
name="vector_search",
description="Useful for searching for specific facts.",
),
)
summary_tool = QueryEngineTool(
index.as_query_engine(response_mode="tree_summarize"),
metadata=ToolMetadata(
name="summary",
description="Useful for summarizing an entire document.",
),
)
from llama_index.core.query_engine import RouterQueryEngine
query_engine = RouterQueryEngine.from_defaults(
[vector_tool, summary_tool], select_multi=False, verbose=True
)
response = query_engine.query(
"Tell me about the song meet the grahams - why is it significant"
)
Selecting query engine 0: The song 'Meet the Grahams' might contain specific facts or information about the band, making it useful for searching for those specific details..
print(response)
"Meet the Grahams" artwork is a crucial part of a larger strategy by Kendrick Lamar to address Drake's family matters in a diss track. The artwork shows a pair of Maybach gloves, a shirt, receipts, and prescription bottles, including one for Ozempic prescribed to Drake. This song is significant because it serves as the full picture that Kendrick teased earlier on "6.16 in LA" and addresses all members of Drake's family, including his son Adonis, mother Sandi, father Dennis, and an alleged 11-year-old daughter. The song takes it to the point of no return, with Kendrick musing that he wishes Dennis Graham wore a condom the night Drake was conceived and telling both Drake's parents that they raised a man whose house is due to be raided any day now on Harvey Weinstein-level allegations.
Break Complex Questions down into Sub-Questions¶
Our sub-question query engine breaks complex questions down into sub-questions.
drake_index = VectorStoreIndex.from_documents(docs_drake)
drake_query_engine = drake_index.as_query_engine(similarity_top_k=3)
kendrick_index = VectorStoreIndex.from_documents(docs_kendrick)
kendrick_query_engine = kendrick_index.as_query_engine(similarity_top_k=3)
from llama_index.core.tools import QueryEngineTool, ToolMetadata
drake_tool = QueryEngineTool(
drake_index.as_query_engine(),
metadata=ToolMetadata(
name="drake_search",
description="Useful for searching over Drake's life.",
),
)
kendrick_tool = QueryEngineTool(
kendrick_index.as_query_engine(),
metadata=ToolMetadata(
name="kendrick_summary",
description="Useful for searching over Kendrick's life.",
),
)
from llama_index.core.query_engine import SubQuestionQueryEngine
query_engine = SubQuestionQueryEngine.from_defaults(
[drake_tool, kendrick_tool],
llm=llm_replicate, # llama3-70b
verbose=True,
)
response = query_engine.query("Which albums did Drake release in his career?")
print(response)
Generated 1 sub questions. [drake_search] Q: What are the albums released by Drake [drake_search] A: Based on the provided context information, the albums released by Drake are: 1. Take Care (album) 2. Nothing Was the Same 3. If You're Reading This It's Too Late (rumored to be a mixtape or album) 4. Certified Lover Boy 5. Honestly, Nevermind Based on the provided context information, the albums released by Drake are: 1. Take Care (album) 2. Nothing Was the Same 3. If You're Reading This It's Too Late (rumored to be a mixtape or album) 4. Certified Lover Boy 5. Honestly, Nevermind
4. Text-to-SQL¶
Here, we download and use a sample SQLite database with 11 tables, with various info about music, playlists, and customers. We will limit ourselves to a select few tables for this test.
!wget "https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip" -O "./data/chinook.zip"
!unzip "./data/chinook.zip"
--2024-05-10 23:40:37-- https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip Resolving www.sqlitetutorial.net (www.sqlitetutorial.net)... 2606:4700:3037::6815:1e8d, 2606:4700:3037::ac43:acfa, 104.21.30.141, ... Connecting to www.sqlitetutorial.net (www.sqlitetutorial.net)|2606:4700:3037::6815:1e8d|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 305596 (298K) [application/zip] Saving to: ‘./data/chinook.zip’ ./data/chinook.zip 100%[===================>] 298.43K --.-KB/s in 0.02s 2024-05-10 23:40:37 (13.9 MB/s) - ‘./data/chinook.zip’ saved [305596/305596]
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: - Avoid using `tokenizers` before the fork if possible - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Archive: ./data/chinook.zip inflating: chinook.db
from sqlalchemy import (
create_engine,
MetaData,
Table,
Column,
String,
Integer,
select,
column,
)
engine = create_engine("sqlite:///chinook.db")
from llama_index.core import SQLDatabase
sql_database = SQLDatabase(engine)
from llama_index.core.indices.struct_store import NLSQLTableQueryEngine
query_engine = NLSQLTableQueryEngine(
sql_database=sql_database,
tables=["albums", "tracks", "artists"],
llm=llm_replicate,
)
response = query_engine.query("What are some albums?")
print(response)
Here are 10 album titles with their corresponding artists: 1. "For Those About To Rock We Salute You" by Artist 1 2. "Balls to the Wall" by Artist 2 3. "Restless and Wild" by Artist 2 4. "Let There Be Rock" by Artist 1 5. "Big Ones" by Artist 3 6. "Jagged Little Pill" by Artist 4 7. "Facelift" by Artist 5 8. "Warner 25 Anos" by Artist 6 9. "Plays Metallica By Four Cellos" by Artist 7 10. "Audioslave" by Artist 8
response = query_engine.query("What are some artists? Limit it to 5.")
print(response)
Here are 5 artists: AC/DC, Accept, Aerosmith, Alanis Morissette, and Alice In Chains.
This last query should be a more complex join:
response = query_engine.query(
"What are some tracks from the artist AC/DC? Limit it to 3"
)
print(response)
Here are three tracks from the legendary Australian rock band AC/DC: "For Those About To Rock (We Salute You)", "Put The Finger On You", and "Let's Get It Up".
print(response.metadata["sql_query"])
SELECT tracks.Name FROM tracks JOIN albums ON tracks.AlbumId = albums.AlbumId JOIN artists ON albums.ArtistId = artists.ArtistId WHERE artists.Name = 'AC/DC' LIMIT 3;
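As a sanity check, the generated join can be reproduced with Python's built-in sqlite3 module. This sketch builds a tiny in-memory copy of the artists/albums/tracks schema (with illustrative rows) rather than assuming chinook.db is present:

```python
import sqlite3

# Toy in-memory copy of the three chinook.db tables used above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT,
                         ArtistId INTEGER REFERENCES artists(ArtistId));
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT,
                         AlbumId INTEGER REFERENCES albums(AlbumId));
    INSERT INTO artists VALUES (1, 'AC/DC');
    INSERT INTO albums VALUES (1, 'For Those About To Rock We Salute You', 1);
    INSERT INTO tracks VALUES
        (1, 'For Those About To Rock (We Salute You)', 1),
        (2, 'Put The Finger On You', 1),
        (3, 'Let''s Get It Up', 1),
        (4, 'Inject The Venom', 1);
    """
)

# The same query NLSQLTableQueryEngine generated above.
rows = conn.execute(
    "SELECT tracks.Name FROM tracks "
    "JOIN albums ON tracks.AlbumId = albums.AlbumId "
    "JOIN artists ON albums.ArtistId = artists.ArtistId "
    "WHERE artists.Name = 'AC/DC' LIMIT 3"
).fetchall()
print([name for (name,) in rows])  # prints the first three AC/DC track names
```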
5. Structured Data Extraction¶
An important use case for function calling is extracting structured objects. LlamaIndex provides an intuitive interface for this through structured_predict - just define the target Pydantic class (which can be nested), and given a prompt, we extract the desired object.
Note: Since Llama3 / Ollama does not have native function calling support, structured extraction is performed by prompting the LLM + output parsing.
from llama_index.llms.ollama import Ollama
from llama_index.core.prompts import PromptTemplate
from pydantic import BaseModel
class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str
llm = Ollama(model="llama3")
prompt_tmpl = PromptTemplate(
    "Generate a restaurant in the given city {city_name}"
)
restaurant_obj = llm.structured_predict(
Restaurant, prompt_tmpl, city_name="Miami"
)
print(restaurant_obj)
name='Tropical Bites' city='Miami' cuisine='Caribbean'
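Since structured_predict returns an instance of the target Pydantic class, the extracted fields are ordinary attributes. A small self-contained sketch (the values are illustrative, not a real LLM call; `model_dump` assumes Pydantic v2, on v1 use `.dict()`):

```python
from pydantic import BaseModel


class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str


# structured_predict would return a Restaurant instance like this one:
restaurant_obj = Restaurant(name="Tropical Bites", city="Miami", cuisine="Caribbean")
print(restaurant_obj.name)          # → Tropical Bites
print(restaurant_obj.model_dump())  # fields as a plain dict
```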
6. Adding Chat History to RAG (Chat Engine)¶
In this section we create a stateful chatbot from a RAG pipeline, with our chat engine abstraction.
Unlike a stateless query engine, the chat engine maintains conversation history (through a memory module like buffer memory). It performs retrieval given a condensed question, and feeds the condensed question + context + chat history into the final LLM prompt.
Related resource: https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_condense_plus_context/
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.chat_engine import CondensePlusContextChatEngine
memory = ChatMemoryBuffer.from_defaults(token_limit=3900)
chat_engine = CondensePlusContextChatEngine.from_defaults(
index.as_retriever(),
memory=memory,
llm=llm,
context_prompt=(
"You are a chatbot, able to have normal interactions, as well as talk"
" about the Kendrick and Drake beef."
"Here are the relevant documents for the context:\n"
"{context_str}"
"\nInstruction: Use the previous chat history, or the context above, to interact and help the user."
),
verbose=True,
)
response = chat_engine.chat(
"Tell me about the songs Drake released in the beef."
)
print(str(response))
response = chat_engine.chat("What about Kendrick?")
print(str(response))
Kendrick Lamar's contributions to the beef! According to the article, Kendrick released several diss tracks in response to Drake's initial shots. One notable track is "Not Like Us", which directly addresses Drake and his perceived shortcomings. However, the article highlights that Kendrick's most significant response was his album "Mr. Morale & The Big Steppers", which features several tracks that can be seen as indirect disses towards Drake. The article also mentions that Kendrick's family has been a target of Drake's attacks, with Drake referencing Kendrick's estranged relationship with his partner Whitney and their two kids (one of whom is allegedly fathered by Dave Free). It's worth noting that Kendrick didn't directly respond to Drake's THP6 track. Instead, he focused on his own music and let the lyrics speak for themselves. Overall, Kendrick's approach was more subtle yet still packed a punch, showcasing his storytelling ability and lyrical prowess. Would you like me to elaborate on any specific tracks or moments from the beef?
7. Agents¶
Here we build agents with Llama 3. We give them access to simple function tools as well as RAG over the documents above.
Agents and Tools¶
import json
from typing import Sequence, List
from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool, FunctionTool
from llama_index.core.agent import ReActAgent
import nest_asyncio
nest_asyncio.apply()
Define Tools¶
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the resulting integer."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the resulting integer."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract two integers and return the resulting integer."""
    return a - b


def divide(a: int, b: int) -> float:
    """Divide two integers and return the result as a float."""
    return a / b
multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)
divide_tool = FunctionTool.from_defaults(fn=divide)
ReAct Agent¶
agent = ReActAgent.from_tools(
[multiply_tool, add_tool, subtract_tool, divide_tool],
llm=llm_replicate,
verbose=True,
)
Querying¶
response = agent.chat("What is (121 + 2) * 5?")
print(str(response))
Thought: The current language of the user is: English. I need to use a tool to help me answer the question. Action: add Action Input: {'a': 121, 'b': 2} Observation: 123 Thought: I have the result of the addition, now I need to multiply it by 5. Action: multiply Action Input: {'a': 123, 'b': 5} Observation: 615 Thought: I can answer without using any more tools. I'll use the user's language to answer Answer: 615 615
ReAct Agent with RAG QueryEngine Tools¶
Here we build a ReAct agent that uses RAG query engines as tools.
from llama_index.core import (
SimpleDirectoryReader,
VectorStoreIndex,
StorageContext,
load_index_from_storage,
)
from llama_index.core.tools import QueryEngineTool, ToolMetadata
Create ReAct Agent using RAG QueryEngine Tools¶
We reuse the Drake and Kendrick query engines from earlier as tools for the agent.
drake_tool = QueryEngineTool(
drake_index.as_query_engine(),
metadata=ToolMetadata(
name="drake_search",
description="Useful for searching over Drake's life.",
),
)
kendrick_tool = QueryEngineTool(
kendrick_index.as_query_engine(),
metadata=ToolMetadata(
name="kendrick_search",
description="Useful for searching over Kendrick's life.",
),
)
query_engine_tools = [drake_tool, kendrick_tool]
agent = ReActAgent.from_tools(
    query_engine_tools,
llm=llm_replicate,
verbose=True,
)
Querying¶
response = agent.chat("Tell me about how Kendrick and Drake grew up")
print(str(response))
Thought: The current language of the user is: English. I need to use a tool to help me answer the question. Action: kendrick_search Action Input: {'input': "Kendrick Lamar's childhood"} Observation: Kendrick Lamar was born on June 17, 1987, in Compton, California. He is the first child of Kenneth "Kenny" Duckworth, a former gang hustler who previously worked at KFC, and Paula Oliver, a hairdresser who previously worked at McDonald's. Both of his parents are African Americans from the South Side of Chicago, and they relocated to Compton in 1984 due to his father's affiliation with the Gangster Disciples. Lamar was named after singer-songwriter Eddie Kendricks of the Temptations. He was an only child until the age of seven and was described as a loner by his mother. Thought: I have information about Kendrick's childhood, but I need to know more about Drake's upbringing to answer the question. Action: drake_search Action Input: {'input': "Drake's childhood"} Observation: Drake was raised in two neighborhoods. He lived on Weston Road in Toronto's working-class west end until grade six and attended Weston Memorial Junior Public School until grade four. He moved to one of the city's affluent neighbourhoods, Forest Hill, in 2000. Drake appeared in a comedic sketch which aired during the 1997 NHL Awards, featuring Martin Brodeur and Ron Hextall. At age 10, he attended Forest Hill Collegiate Institute for high school. Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again. Thought: I apologize for the mistake. I need to use a tool to help me answer the question. Action: drake_search Action Input: {'input': "Drake's childhood"} Observation: Drake was raised in two neighborhoods. He lived on Weston Road in Toronto's working-class west end until grade six and attended Weston Memorial Junior Public School until grade four. 
He played minor hockey with the Weston Red Wings, reaching the Upper Canada College hockey camp before leaving due to a vicious cross-check to his neck during a game. At age 10, Drake appeared in a comedic sketch which aired during the 1997 NHL Awards. Thought: I have information about both Kendrick and Drake's childhood, so I can answer the question without using any more tools. Answer: Kendrick Lamar grew up in Compton, California, as the child of a former gang hustler and a hairdresser, while Drake was raised in two neighborhoods in Toronto, Ontario, Canada, and had a brief experience in minor hockey before pursuing a career in entertainment. Kendrick Lamar grew up in Compton, California, as the child of a former gang hustler and a hairdresser, while Drake was raised in two neighborhoods in Toronto, Ontario, Canada, and had a brief experience in minor hockey before pursuing a career in entertainment.