Neo4j Property Graph Index
Neo4j is a production-grade graph database that can store property graphs and perform vector search, filtering, and more.

The easiest way to get started is with a cloud-hosted instance from Neo4j Aura.

For this notebook, we will cover how to run the database locally with Docker.

If you already have an existing graph database, skip to the end of this notebook.
In [ ]:
%pip install llama-index llama-index-graph-stores-neo4j
Docker Setup
To launch Neo4j locally, first make sure you have Docker installed. Then, you can start the database with the following docker command:
docker run \
-p 7474:7474 -p 7687:7687 \
-v $PWD/data:/data -v $PWD/plugins:/plugins \
--name neo4j-apoc \
-e NEO4J_apoc_export_file_enabled=true \
-e NEO4J_apoc_import_file_enabled=true \
-e NEO4J_apoc_import_file_use__neo4j__config=true \
-e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
neo4j:latest
From here, you can open the database at http://localhost:7474/. On this page, you will be asked to log in. Use the default username/password of neo4j and neo4j.

After logging in for the first time, you will be asked to change the password.

After that, you are ready to create your first property graph!
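If you prefer to change the password from the command line rather than the browser UI, Neo4j supports this in Cypher via `ALTER CURRENT USER` (the new password below is a placeholder; run it against the `system` database, e.g. with `docker exec -it neo4j-apoc cypher-shell -u neo4j -p neo4j -d system`):

```cypher
ALTER CURRENT USER SET PASSWORD FROM 'neo4j' TO 'your-new-password';
```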
Environment Setup
We only need a little bit of environment setup before we can get started.
In [ ]:
import os

os.environ["OPENAI_API_KEY"] = "sk-proj-..."
In [ ]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
In [ ]:
import nest_asyncio

nest_asyncio.apply()
In [ ]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
Index Construction
In [ ]:
from llama_index.graph_stores.neo4j import Neo4jPGStore

graph_store = Neo4jPGStore(
    username="neo4j",
    password="<password>",
    url="bolt://localhost:7687",
)
In [ ]:
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

index = PropertyGraphIndex.from_documents(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
    property_graph_store=graph_store,
    show_progress=True,
)
Parsing nodes: 100%|██████████| 1/1 [00:00<00:00, 26.07it/s]
Extracting paths from text: 100%|██████████| 22/22 [00:10<00:00, 2.10it/s]
Extracting implicit paths: 100%|██████████| 22/22 [00:00<00:00, 31418.01it/s]
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00, 1.22it/s]
Generating embeddings: 100%|██████████| 5/5 [00:00<00:00, 7.05it/s]
Now that the graph is created, we can explore it in the UI by visiting http://localhost:7474/.

The easiest way to see the entire graph is to use a Cypher command like "match n=() return n" at the top.

To delete an entire graph, a useful command is "match n=() detach delete n".
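Spelled out as runnable Cypher, the exploration commands above (plus a quick node count as a sanity check) look like this:

```cypher
// visualize every node and relationship (fine for a small demo graph)
MATCH n=() RETURN n;

// count how many nodes were created
MATCH (n) RETURN count(n);

// wipe the entire graph
MATCH n=() DETACH DELETE n;
```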
Querying and Retrieval
In [ ]:
retriever = index.as_retriever(
    include_text=False,  # include source text in returned nodes, default True
)

nodes = retriever.retrieve("What happened at Interleaf and Viaweb?")

for node in nodes:
    print(node.text)
Interleaf -> Got crushed by -> Moore's law
Interleaf -> Made -> Scripting language
Interleaf -> Had -> Smart people
Interleaf -> Inspired by -> Emacs
Interleaf -> Had -> Few years to live
Interleaf -> Made -> Software
Interleaf -> Had done -> Something bold
Interleaf -> Added -> Scripting language
Interleaf -> Built -> Impressive technology
Interleaf -> Was -> Company
Viaweb -> Was -> Profitable
Viaweb -> Was -> Growing rapidly
Viaweb -> Suggested -> Hospital
Idea -> Was clear from -> Experience
Idea -> Would have to be embodied as -> Company
Painting department -> Seemed to be -> Rigorous
In [ ]:
query_engine = index.as_query_engine(include_text=True)

response = query_engine.query("What happened at Interleaf and Viaweb?")

print(str(response))
Interleaf had smart people and built impressive technology but got crushed by Moore's Law. Viaweb was profitable and growing rapidly.
Loading from an Existing Graph
If you have a pre-existing graph (either created with LlamaIndex or otherwise), we can connect to and use it!

NOTE: If your graph was created outside of LlamaIndex, the most useful retrievers will be text-to-cypher or cypher templates. Other retrievers rely on properties that LlamaIndex inserts.
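As a sketch of what a cypher template might look like against a LlamaIndex-built graph (the `__Entity__` label and `name` property match what recent LlamaIndex versions write; treat them as assumptions and adjust to your own schema if the graph was built elsewhere):

```cypher
// parameterized template: find triples touching a given entity
MATCH (e:`__Entity__`)-[r]->(t)
WHERE e.name CONTAINS $entity_name
RETURN e.name AS source, type(r) AS relation, t.name AS target
LIMIT 25;
```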
In [ ]:
from llama_index.graph_stores.neo4j import Neo4jPGStore
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

graph_store = Neo4jPGStore(
    username="neo4j",
    password="<password>",
    url="bolt://localhost:7687",
)

index = PropertyGraphIndex.from_existing(
    property_graph_store=graph_store,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
)
From here, we can still insert more documents!
In [ ]:
from llama_index.core import Document

document = Document(text="LlamaIndex is great!")

index.insert(document)
In [ ]:
nodes = index.as_retriever(include_text=False).retrieve("LlamaIndex")

print(nodes[0].text)
Llamaindex -> Is -> Great
For full details on construction, retrieval, and querying of a property graph, see the full documentation page.