If you're opening this notebook on Colab, you'll probably need to install LlamaIndex 🦙.
In [ ]:
%pip install llama-index-embeddings-huggingface
%pip install llama-index-embeddings-instructor
In [ ]:
!pip install llama-index
In [ ]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# loads BAAI/bge-small-en
# embed_model = HuggingFaceEmbedding()

# loads BAAI/bge-small-en-v1.5
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
In [ ]:
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
384
[-0.030880315229296684, -0.11021008342504501, 0.3917851448059082, -0.35962796211242676, 0.22797748446464539]
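Embedding vectors like the 384-dimensional one above are typically compared with cosine similarity. A minimal, self-contained sketch on toy vectors (the real vectors returned by `get_text_embedding` would be compared the same way):

```python
import math


def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by the
    # product of their Euclidean norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```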
InstructorEmbedding

Instructor embeddings are a class of embeddings specifically trained to augment their embeddings according to an instruction. By default, queries are given query_instruction="Represent the question for retrieving supporting documents: " and text is given text_instruction="Represent the document for retrieval: ".

They rely on the Instructor and SentenceTransformers (version 2.2.2) pip packages, which you can install with pip install InstructorEmbedding and pip install -U sentence-transformers==2.2.2.
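The role of those instruction strings can be sketched without the model itself. The helper below is a hypothetical illustration of how Instructor-style models consume [instruction, text] pairs; it is not the library's internal code:

```python
# Default instructions as described above.
DEFAULT_QUERY_INSTRUCTION = (
    "Represent the question for retrieving supporting documents: "
)
DEFAULT_TEXT_INSTRUCTION = "Represent the document for retrieval: "


def build_instructor_input(text, instruction):
    # Instructor models encode a batch of [instruction, text] pairs,
    # so the instruction conditions the resulting embedding.
    return [[instruction, text]]


query_pairs = build_instructor_input(
    "How do instruction embeddings work?", DEFAULT_QUERY_INSTRUCTION
)
print(query_pairs)
```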
In [ ]:
from llama_index.embeddings.instructor import InstructorEmbedding
embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/InstructorEmbedding/instructor.py:7: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)
  from tqdm.autonotebook import trange
load INSTRUCTOR_Transformer
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
max_seq_length  512
In [ ]:
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
768
[ 0.02155361 -0.06098218  0.01796207  0.05490903  0.01526906]
OptimumEmbedding

Optimum is a HuggingFace library for exporting and running HuggingFace models in the ONNX format.

You can install the dependencies with pip install transformers optimum[exporters].

First, we need to create the ONNX model. ONNX models provide faster inference and can also be used across platforms (e.g. in TransformersJS).
In [ ]:
from llama_index.embeddings.huggingface_optimum import OptimumEmbedding
OptimumEmbedding.create_and_save_optimum_model(
"BAAI/bge-small-en-v1.5", "./bge_onnx"
)
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Framework not specified. Using pt to export to ONNX.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Using framework PyTorch: 2.0.1+cu117
Overriding 1 configuration item(s)
- use_cache -> False
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================== 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
Saved optimum model to ./bge_onnx. Use it with `embed_model = OptimumEmbedding(folder_name='./bge_onnx')`.
In [ ]:
embed_model = OptimumEmbedding(folder_name="./bge_onnx")
In [ ]:
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
384
[-0.10364960134029388, -0.20998482406139374, -0.01883639395236969, -0.5241696834564209, 0.0335749015212059]
Benchmarking

Let's try comparing using a classic large document: Chapter 3 of the IPCC Climate Report.
In [ ]:
!curl https://www.ipcc.ch/report/ar6/wg2/downloads/report/IPCC_AR6_WGII_Chapter03.pdf --output IPCC_AR6_WGII_Chapter03.pdf
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 20.7M  100 20.7M    0     0  16.5M      0  0:00:01  0:00:01 --:--:-- 16.5M
In [ ]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core import Settings
documents = SimpleDirectoryReader(
input_files=["IPCC_AR6_WGII_Chapter03.pdf"]
).load_data()
In this example, we use a pretrained model from HuggingFace to generate text embeddings. These embeddings can be used for tasks such as text classification and similarity matching. We load the pretrained model with the transformers library and use it to generate the embeddings.
In [ ]:
import os
import openai

# needed later to synthesize the response
os.environ["OPENAI_API_KEY"] = "sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]
In [ ]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# loads BAAI/bge-small-en-v1.5
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
test_emeds = embed_model.get_text_embedding("Hello World!")
Settings.embed_model = embed_model
In [ ]:
%%timeit -r 1 -n 1
index = VectorStoreIndex.from_documents(documents, show_progress=True)
Parsing documents into nodes: 0%| | 0/172 [00:00<?, ?it/s]
Generating embeddings: 0%| | 0/428 [00:00<?, ?it/s]
1min 27s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
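%%timeit -r 1 -n 1 is notebook magic; in a plain script, the same one-shot measurement can be sketched with the standard library. The workload below is a placeholder; in the notebook it would be the VectorStoreIndex.from_documents call:

```python
import time


def time_once(fn):
    # Run fn a single time and report the wall-clock duration,
    # similar to %%timeit -r 1 -n 1 in the cells above.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Placeholder workload standing in for the indexing call.
result, elapsed = time_once(lambda: sum(range(1_000_000)))
print(f"took {elapsed:.3f}s")
```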
Optimum Embeddings

We can use the ONNX embeddings we created earlier.
In [ ]:
from llama_index.embeddings.huggingface_optimum import OptimumEmbedding
embed_model = OptimumEmbedding(folder_name="./bge_onnx")
test_emeds = embed_model.get_text_embedding("Hello World!")
Settings.embed_model = embed_model
In [ ]:
%%timeit -r 1 -n 1
index = VectorStoreIndex.from_documents(documents, show_progress=True)
Parsing documents into nodes: 0%| | 0/172 [00:00<?, ?it/s]
Generating embeddings: 0%| | 0/428 [00:00<?, ?it/s]
1min 9s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
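From the two timings above, the relative speedup of the ONNX model can be computed directly (values transcribed from the %%timeit output):

```python
baseline_s = 87.0  # base HuggingFace embeddings: 1min 27s
optimum_s = 69.0   # ONNX Optimum embeddings: 1min 9s

speedup = baseline_s / optimum_s
print(f"ONNX is {speedup:.2f}x faster for this document")  # ~1.26x
```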