Llama2 + VectorStoreIndex
This notebook walks through the proper setup to use llama-2 with LlamaIndex. Specifically, we look at how to use a vector store index.
Setup
If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-llms-replicate
!pip install llama-index
Keys
import os

os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["REPLICATE_API_TOKEN"] = "YOUR_REPLICATE_TOKEN"
Load documents, build the VectorStoreIndex
# Optional logging
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.INFO)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from IPython.display import Markdown, display
from llama_index.llms.replicate import Replicate
from llama_index.core.llms.llama_utils import (
    messages_to_prompt,
    completion_to_prompt,
)

# The Replicate endpoint
LLAMA_13B_V2_CHAT = "a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5"


# inject a custom system prompt into llama-2
def custom_completion_to_prompt(completion: str) -> str:
    return completion_to_prompt(
        completion,
        system_prompt=(
            "You are a Q&A assistant. Your goal is to answer questions as "
            "accurately as possible, based on the instructions and context provided."
        ),
    )


llm = Replicate(
    model=LLAMA_13B_V2_CHAT,
    temperature=0.01,
    # override max tokens since it's interpreted
    # as the context window instead of max tokens
    context_window=4096,
    # override the completion representation for llama 2
    completion_to_prompt=custom_completion_to_prompt,
    # if using llama 2 for data agents, also override the message representation
    messages_to_prompt=messages_to_prompt,
)
from llama_index.core import Settings

Settings.llm = llm
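Before building the index, a quick optional sanity check (not part of the original notebook) confirms the Replicate endpoint responds; complete is the standard LlamaIndex LLM call and returns a CompletionResponse whose text is in .text:

# Optional sanity check: make sure the Replicate endpoint is reachable.
check = llm.complete("Say hello in one short sentence.")
print(check.text)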
Download Data
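The original download cell isn't reproduced here; a typical way to fetch the essay, assuming the standard example-data location in the llama_index repository (the URL is an assumption and may need adjusting):

# Assumed URL: the standard LlamaIndex example-data location for this essay.
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'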
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
index = VectorStoreIndex.from_documents(documents)
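Note that from_documents re-embeds the documents on every run. Optionally, the index can be persisted to disk and reloaded later; a minimal sketch using LlamaIndex's storage APIs, assuming an arbitrary ./storage directory:

# Optional: persist the index to disk so it isn't rebuilt each run.
index.storage_context.persist(persist_dir="./storage")

# Later, reload it instead of re-embedding the documents:
from llama_index.core import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)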
Querying
# set Logging to DEBUG for more detailed outputs
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
display(Markdown(f"<b>{response}</b>"))
Based on the context information provided, the author's activities growing up were:
- Writing short stories, which were "awful" and had "hardly any plot."
- Programming on an IBM 1401 computer in 9th grade, using an early version of Fortran language.
- Building simple games, a program to predict the height of model rockets, and a word processor for his father.
- Reading science fiction novels, such as "The Moon is a Harsh Mistress" by Heinlein, which inspired him to work on AI.
- Living in Florence, Italy, and walking through the city's streets to the Accademia.
Please note that these activities are mentioned in the text and are not based on prior knowledge or assumptions.
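To see which chunks of the essay grounded the answer, the response's source nodes can be inspected (an optional sketch using standard response attributes; the 200-character preview length is arbitrary):

# Optional: inspect the retrieved chunks and their similarity scores.
for source_node in response.source_nodes:
    print(f"score: {source_node.score}")
    print(source_node.node.get_content()[:200])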
Streaming Support
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What happened at interleaf?")
for token in response.response_gen:
    print(token, end="")
Based on the context information provided, it appears that the author worked at Interleaf, a company that made software for creating and managing documents. The author mentions that Interleaf was "on the way down" and that the company's Release Engineering group was large compared to the group that actually wrote the software. It is inferred that Interleaf was experiencing financial difficulties and that the author was nervous about money. However, there is no explicit mention of what specifically happened at Interleaf.
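Equivalently, the streaming response object provides a convenience helper that runs the same token loop internally:

# Equivalent shorthand: print the streamed tokens as they arrive.
response = query_engine.query("What happened at interleaf?")
response.print_response_stream()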