<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/embeddings/octoai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
[![nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/aidoczh/llama_index_examples_zh/blob/main/examples/embeddings/octoai.ipynb)
First, let's install the LlamaIndex and OctoAI dependencies.
```python
%pip install llama-index-embeddings-octoai
```
```python
!pip install llama-index
```
```python
OCTOAI_API_KEY = ""
```
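Rather than hard-coding the key in the notebook, you can read it from an environment variable. A minimal sketch, assuming the key is exported under the (hypothetical) variable name `OCTOAI_API_KEY`:

```python
import os

# Fall back to an empty string if the variable is not set,
# so the notebook still runs (API calls will then fail with an auth error).
OCTOAI_API_KEY = os.environ.get("OCTOAI_API_KEY", "")
```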
We can now query embeddings from OctoAI.
```python
from llama_index.embeddings.octoai import OctoAIEmbedding

embed_model = OctoAIEmbedding(api_key=OCTOAI_API_KEY)
```
```python
# Basic embedding example
embeddings = embed_model.get_text_embedding("How do I sail to the moon?")
print(len(embeddings), embeddings[:10])
assert len(embeddings) == 1024
```
## Using Batched Embeddings
When working with large datasets, embedding texts in batches can significantly improve throughput. Here we embed several texts in a single call.
```python
texts = [
    "How do I sail to the moon?",
    "What is the best way to cook a steak?",
    "How do I apply for a job?",
]

embeddings = embed_model.get_text_embedding_batch(texts)
print(len(embeddings))
assert len(embeddings) == 3
assert len(embeddings[0]) == 1024
```
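The vectors returned for each text can be compared with cosine similarity, which is how embedding-based retrieval ranks results. A minimal sketch using small synthetic vectors in place of real OctoAI output (with real output you would pass `embeddings[i]` from `get_text_embedding_batch`):

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Tiny illustrative vectors; real OctoAI embeddings are 1024-dimensional.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
v3 = [0.0, 1.0, 0.0]

print(cosine_similarity(v1, v2))  # identical vectors -> 1.0
print(cosine_similarity(v1, v3))  # orthogonal vectors -> 0.0
```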