Cloudflare Workers AI Embeddings
Setup
Install the library via pip.
%pip install llama-index-embeddings-cloudflare-workersai
# %pip install -e ~/llama_index/llama-index-integrations/embeddings/llama-index-embeddings-cloudflare-workersai
To access Cloudflare Workers AI, you need a Cloudflare account ID and an API token. To obtain both, follow the instructions in this document.
# Initialize with account ID and API token
# import os
# my_account_id = "example_id"
# my_api_token = "example_token"
# os.environ["CLOUDFLARE_AUTH_TOKEN"] = "my_api_token"
import getpass

my_account_id = getpass.getpass("Enter your Cloudflare account ID:\n\n")
my_api_token = getpass.getpass("Enter your Cloudflare API token:\n\n")
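For non-interactive runs, the credentials could be read from environment variables instead, falling back to a prompt only when a variable is unset. The following is a minimal sketch: `read_secret` is a hypothetical helper, and `CLOUDFLARE_ACCOUNT_ID` is an assumed variable name (only `CLOUDFLARE_AUTH_TOKEN` appears in the commented-out lines above).

```python
import getpass
import os


def read_secret(env_var: str, prompt: str) -> str:
    """Return the value of env_var if it is set and non-empty, else prompt for it."""
    value = os.environ.get(env_var)
    if value:
        return value
    # Fall back to an interactive prompt that does not echo the input.
    return getpass.getpass(prompt)


# Example usage (variable names are assumptions, adjust to your environment):
# my_account_id = read_secret("CLOUDFLARE_ACCOUNT_ID", "Enter your Cloudflare account ID:\n\n")
# my_api_token = read_secret("CLOUDFLARE_AUTH_TOKEN", "Enter your Cloudflare API token:\n\n")
```

This keeps secrets out of the notebook itself while still allowing interactive use.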
Text embeddings example
from llama_index.embeddings.cloudflare_workersai import CloudflareEmbedding

my_embed = CloudflareEmbedding(
    account_id=my_account_id,
    auth_token=my_api_token,
    model="@cf/baai/bge-small-en-v1.5",
)

embeddings = my_embed.get_text_embedding("Why sky is blue")
print(len(embeddings))
print(embeddings[:5])
384
[-0.04786296561360359, -0.030788540840148926, -0.07126234471797943, -0.04107927531003952, 0.02904760278761387]
Embed in batches
As of 2024-03-31, Cloudflare limits the batch size to a maximum of 100.
embeddings = my_embed.get_text_embedding_batch(
    ["Why sky is blue", "Why roses are red"]
)
print(len(embeddings))
print(len(embeddings[0]))
print(embeddings[0][:5])
print(embeddings[1][:5])
2
384
[-0.04786296561360359, -0.030788540840148926, -0.07126234471797943, -0.04107927531003952, 0.02904760278761387]
[-0.08951402455568314, -0.015274363569915295, 0.04728245735168457, 0.05478525161743164, 0.05978189781308174]
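Given the batch-size limit of 100 mentioned in this section, a longer list of texts has to be sent in several requests. The sketch below makes that splitting explicit. `chunked` and `embed_in_chunks` are hypothetical helpers written for illustration, not part of the library, and `get_text_embedding_batch` may already split requests internally, in which case this wrapper is unnecessary.

```python
from typing import Callable, Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # limit quoted in this section, as of 2024-03-31


def chunked(items: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def embed_in_chunks(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    size: int = MAX_BATCH_SIZE,
) -> List[List[float]]:
    """Call embed_batch once per chunk and concatenate the results in order."""
    vectors: List[List[float]] = []
    for chunk in chunked(texts, size):
        vectors.extend(embed_batch(chunk))
    return vectors


# Example usage with the embedding object created earlier (assumed to exist):
# all_vectors = embed_in_chunks(my_texts, my_embed.get_text_embedding_batch)
```

Passing the embedding call as a function argument keeps the helper independent of any particular client, so it can be exercised with a stand-in function before spending API requests.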