AnalyticDB¶
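AnalyticDB for PostgreSQL is a massively parallel processing (MPP) data warehousing service designed to analyze large volumes of data online.
To run this notebook, you need an AnalyticDB for PostgreSQL instance running in the cloud (you can get one at common-buy.aliyun.com).
After creating the instance, create a manager account through the API or under "Account Management" on the instance detail web page.
Make sure llama-index is installed: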
In [ ]:
%pip install llama-index-vector-stores-analyticdb
In [ ]:
!pip install llama-index
Please provide parameters:¶
In [ ]:
import os
import getpass

# Alibaba Cloud RAM AccessKey ID (AK) and AccessKey secret (SK):
alibaba_cloud_ak = ""
alibaba_cloud_sk = ""

# Instance information:
region_id = "cn-hangzhou"  # region ID of the specific instance
instance_id = "gp-xxxx"  # ADB instance ID
account = "test_account"  # instance account name, created through the API or under "Account Management" on the instance detail page
account_password = ""  # instance account password
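The cell above imports `os` and `getpass` but then hard-codes empty strings. One way to avoid writing secrets into the notebook is sketched below: read each value from an environment variable and fall back to an interactive prompt. The helper name `read_setting` and the environment variable names in the usage note are assumptions made for illustration; they are not part of the original notebook.

```python
import getpass
import os


def read_setting(env_name, prompt, secret=False):
    """Return the value of the environment variable `env_name`.

    If the variable is unset or empty, ask the user instead. Secrets are
    read with getpass so they are not echoed to the screen.
    """
    value = os.environ.get(env_name)
    if value:
        return value
    if secret:
        return getpass.getpass(prompt)
    return input(prompt)
```

Under these assumptions, the parameters could be filled in with calls such as `alibaba_cloud_ak = read_setting("ALIBABA_CLOUD_ACCESS_KEY_ID", "AccessKey ID: ")` and `account_password = read_setting("ADB_ACCOUNT_PASSWORD", "Account password: ", secret=True)`, where the second variable name is purely hypothetical.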
Import the required package dependencies:¶
In [ ]:
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
)
from llama_index.vector_stores.analyticdb import AnalyticDBVectorStore
Download the data:¶
In [ ]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
Load the data:¶
In [ ]:
# Load the documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
print(f"Total documents: {len(documents)}")
print(f"First document, id: {documents[0].doc_id}")
print(f"First document, hash: {documents[0].hash}")
print(
    "First document, text"
    f" ({len(documents[0].text)} characters):\n{'='*20}\n{documents[0].text[:360]} ..."
)
Create the AnalyticDB vector store object:¶
In [ ]:
analytic_db_store = AnalyticDBVectorStore.from_params(
    access_key_id=alibaba_cloud_ak,
    access_key_secret=alibaba_cloud_sk,
    region_id=region_id,
    instance_id=instance_id,
    account=account,
    account_password=account_password,
    namespace="llama",
    collection="llama",
    metrics="cosine",
    embedding_dimension=1536,
)
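The `embedding_dimension=1536` argument has to agree with the length of the vectors that the embedding model actually produces; 1536 matches OpenAI's `text-embedding-ada-002`, and a different model may need a different value. The sketch below is a small sanity check under that assumption. The helper `embedding_dimension_matches` is hypothetical and not part of the AnalyticDB integration; it accepts any callable that maps a string to a list of floats.

```python
def embedding_dimension_matches(embed_fn, expected_dimension, probe_text="dimension probe"):
    """Embed a short probe string and compare the vector length to the configured dimension.

    `embed_fn` is any callable taking a string and returning a sequence of floats.
    """
    vector = embed_fn(probe_text)
    return len(vector) == expected_dimension
```

With LlamaIndex, one candidate for `embed_fn` is `Settings.embed_model.get_text_embedding` (after `from llama_index.core import Settings`). Note that calling it sends a real embedding request to the configured provider.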
Build the index from the documents:¶
In [ ]:
storage_context = StorageContext.from_defaults(vector_store=analytic_db_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)
Query using the index:¶
In [ ]:
query_engine = index.as_query_engine()
response = query_engine.query("Why did the author choose to work on AI?")
print(response.response)
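Besides the answer text, it can be useful to see which chunks were retrieved from AnalyticDB to produce it. The sketch below assumes the response object exposes a `source_nodes` list whose items carry a `score` and a `node` with a `get_content()` method, which is how LlamaIndex responses are usually structured. The helper `summarize_sources` is a hypothetical name introduced here, not something from the notebook.

```python
def summarize_sources(response, max_chars=80):
    """Return (score, text snippet) pairs for the nodes retrieved for a response.

    Assumes each item in `response.source_nodes` has a `score` attribute and
    a `node` attribute with a `get_content()` method.
    """
    summary = []
    for item in getattr(response, "source_nodes", []):
        text = item.node.get_content()
        # Keep only the first `max_chars` characters of each chunk
        summary.append((item.score, text[:max_chars]))
    return summary
```

Calling `summarize_sources(response)` after the query above would then list the similarity score and the opening characters of each retrieved chunk.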
Delete the collection:¶
In [ ]:
analytic_db_store.delete_collection()