Neo4j Property Graph Index
Neo4j is a production-grade graph database that can store property graphs and perform vector search, filtering, and more.

The easiest way to get started is with a cloud-hosted instance from Neo4j Aura.

For this notebook, we will cover how to run the database locally with Docker.

If you already have an existing graph database, skip to the end of this notebook.
In [ ]:
%pip install llama-index llama-index-graph-stores-neo4j
Docker Setup
To launch Neo4j locally, first make sure you have Docker installed. Then, you can start the database with the following docker command:
docker run \
-p 7474:7474 -p 7687:7687 \
-v $PWD/data:/data -v $PWD/plugins:/plugins \
--name neo4j-apoc \
-e NEO4J_apoc_export_file_enabled=true \
-e NEO4J_apoc_import_file_enabled=true \
-e NEO4J_apoc_import_file_use__neo4j__config=true \
-e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
neo4j:latest
From here, you can open the database at http://localhost:7474/. On this page, you will be asked to log in. Use the default username/password of neo4j and neo4j.

After logging in for the first time, you will be asked to change the password.

After that, you are ready to create your first property graph!
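If you prefer to change the password from the command line rather than the browser UI, Neo4j supports this in Cypher via `ALTER CURRENT USER` (the new password below is a placeholder; run it against the `system` database, e.g. with `docker exec -it neo4j-apoc cypher-shell -u neo4j -p neo4j -d system`):

```cypher
ALTER CURRENT USER SET PASSWORD FROM 'neo4j' TO 'your-new-password';
```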
Environment Setup
We only need a little bit of environment setup before we can get started.
In [ ]:
import os

os.environ["OPENAI_API_KEY"] = "sk-proj-..."
In [ ]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
In [ ]:
import nest_asyncio

nest_asyncio.apply()
In [ ]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
Index Construction
In [ ]:
from llama_index.graph_stores.neo4j import Neo4jPGStore

graph_store = Neo4jPGStore(
    username="neo4j",
    password="<password>",
    url="bolt://localhost:7687",
)
In [ ]:
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

index = PropertyGraphIndex.from_documents(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
    property_graph_store=graph_store,
    show_progress=True,
)
Parsing nodes: 100%|██████████| 1/1 [00:00<00:00, 26.07it/s]
Extracting paths from text: 100%|██████████| 22/22 [00:10<00:00, 2.10it/s]
Extracting implicit paths: 100%|██████████| 22/22 [00:00<00:00, 31418.01it/s]
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00, 1.22it/s]
Generating embeddings: 100%|██████████| 5/5 [00:00<00:00, 7.05it/s]
Now that the graph is created, we can explore it in the UI by visiting http://localhost:7474/.

The easiest way to see the entire graph is to use a Cypher command like "match n=() return n" at the top.

To delete an entire graph, a useful command is "match n=() detach delete n".
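Spelled out as runnable Cypher, the exploration commands above (plus a quick node count as a sanity check) look like this:

```cypher
// visualize every node and relationship (fine for a small demo graph)
MATCH n=() RETURN n;

// count how many nodes were created
MATCH (n) RETURN count(n);

// wipe the entire graph
MATCH n=() DETACH DELETE n;
```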
Querying and Retrieval
In [ ]:
retriever = index.as_retriever(
    include_text=False,  # include source text in returned nodes, default True
)

nodes = retriever.retrieve("What happened at Interleaf and Viaweb?")

for node in nodes:
    print(node.text)
Interleaf -> Got crushed by -> Moore's law
Interleaf -> Made -> Scripting language
Interleaf -> Had -> Smart people
Interleaf -> Inspired by -> Emacs
Interleaf -> Had -> Few years to live
Interleaf -> Made -> Software
Interleaf -> Had done -> Something bold
Interleaf -> Added -> Scripting language
Interleaf -> Built -> Impressive technology
Interleaf -> Was -> Company
Viaweb -> Was -> Profitable
Viaweb -> Was -> Growing rapidly
Viaweb -> Suggested -> Hospital
Idea -> Was clear from -> Experience
Idea -> Would have to be embodied as -> Company
Painting department -> Seemed to be -> Rigorous
In [ ]:
query_engine = index.as_query_engine(include_text=True)

response = query_engine.query("What happened at Interleaf and Viaweb?")

print(str(response))
Interleaf had smart people and built impressive technology but got crushed by Moore's Law. Viaweb was profitable and growing rapidly.
Loading from an Existing Graph
If you have a pre-existing graph (either created with LlamaIndex or otherwise), we can connect to and use it!

NOTE: If your graph was created outside of LlamaIndex, the most useful retrievers will be text-to-cypher or cypher templates. Other retrievers rely on properties that LlamaIndex inserts.
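As a sketch of what a cypher template might look like against a LlamaIndex-built graph (the `__Entity__` label and `name` property match what recent LlamaIndex versions write; treat them as assumptions and adjust to your own schema if the graph was built elsewhere):

```cypher
// parameterized template: find triples touching a given entity
MATCH (e:`__Entity__`)-[r]->(t)
WHERE e.name CONTAINS $entity_name
RETURN e.name AS source, type(r) AS relation, t.name AS target
LIMIT 25;
```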
In [ ]:
from llama_index.graph_stores.neo4j import Neo4jPGStore
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

graph_store = Neo4jPGStore(
    username="neo4j",
    password="<password>",
    url="bolt://localhost:7687",
)

index = PropertyGraphIndex.from_existing(
    property_graph_store=graph_store,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
)
From here, we can still insert more documents!
In [ ]:
from llama_index.core import Document

document = Document(text="LlamaIndex is great!")

index.insert(document)
In [ ]:
nodes = index.as_retriever(include_text=False).retrieve("LlamaIndex")

print(nodes[0].text)
Llamaindex -> Is -> Great
For full details on construction, retrieval, and querying of a property graph, see the full documentation page.