Skip to main content
Open In ColabOpen on GitHub

Needle 文档加载器

Needle 让您轻松创建 RAG 管道,只需最少的努力。

更多详情,请参阅我们的API文档

概述

Needle 文档加载器是一个用于将 Needle 集合与 LangChain 集成的工具。它能够无缝地存储、检索和利用文档,以支持检索增强生成(RAG)工作流。

这个示例演示了:

  • 将文档存储到Needle集合中。
  • 设置一个检索器以获取文档。
  • 构建一个检索增强生成(RAG)管道。

设置

在开始之前,请确保您已设置以下环境变量:

  • NEEDLE_API_KEY: 用于与Needle进行身份验证的API密钥。
  • OPENAI_API_KEY: 用于语言模型操作的OpenAI API密钥。
import os
os.environ["NEEDLE_API_KEY"] = ""
os.environ["OPENAI_API_KEY"] = ""

初始化

要初始化NeedleLoader,您需要以下参数:

  • needle_api_key: 您的 Needle API 密钥(或将其设置为环境变量)。
  • collection_id: 要操作的Needle集合的ID。

实例化

from langchain_community.document_loaders.needle import NeedleLoader

collection_id = "clt_01J87M9T6B71DHZTHNXYZQRG5H"

# Initialize NeedleLoader to store documents to the collection
document_loader = NeedleLoader(
needle_api_key=os.getenv("NEEDLE_API_KEY"),
collection_id=collection_id,
)
API Reference:NeedleLoader

加载

将文件添加到Needle集合中:

files = {
"tech-radar-30.pdf": "https://www.thoughtworks.com/content/dam/thoughtworks/documents/radar/2024/04/tr_technology_radar_vol_30_en.pdf"
}

document_loader.add_files(files=files)
# Show the documents in the collection
# collections_documents = document_loader.load()

懒加载

lazy_load 方法允许您从 Needle 集合中迭代加载文档,并在获取每个文档时生成它:

# Show the documents in the collection
# collections_documents = document_loader.lazy_load()

用法

在链中使用

下面是一个在链中使用Needle设置RAG管道的完整示例:

import os

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.retrievers.needle import NeedleRetriever
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

# Initialize the Needle retriever (make sure your Needle API key is set as an environment variable)
retriever = NeedleRetriever(
needle_api_key=os.getenv("NEEDLE_API_KEY"),
collection_id="clt_01J87M9T6B71DHZTHNXYZQRG5H",
)

# Define system prompt for the assistant
system_prompt = """
You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know, say so concisely.\n\n{context}
"""

prompt = ChatPromptTemplate.from_messages(
[("system", system_prompt), ("human", "{input}")]
)

# Define the question-answering chain using a document chain (stuff chain) and the retriever
question_answer_chain = create_stuff_documents_chain(llm, prompt)

# Create the RAG (Retrieval-Augmented Generation) chain by combining the retriever and the question-answering chain
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

# Define the input query
query = {"input": "Did RAG move to accepted?"}

response = rag_chain.invoke(query)

response
{'input': 'Did RAG move to accepted?',
'context': [Document(metadata={}, page_content='New Moved in/out No change\n\n© Thoughtworks, Inc. All Rights Reserved. 12\n\nTechniques\n\n1. Retrieval-augmented generation (RAG)\nAdopt\n\nRetrieval-augmented generation (RAG) is the preferred pattern for our teams to improve the quality of \nresponses generated by a large language model (LLM). We’ve successfully used it in several projects, \nincluding the popular Jugalbandi AI Platform. With RAG, information about relevant and trustworthy \ndocuments — in formats like HTML and PDF — are stored in databases that supports a vector data \ntype or efficient document search, such as pgvector, Qdrant or Elasticsearch Relevance Engine. For \na given prompt, the database is queried to retrieve relevant documents, which are then combined \nwith the prompt to provide richer context to the LLM. This results in higher quality output and greatly \nreduced hallucinations. The context window — which determines the maximum size of the LLM input \n— is limited, which means that selecting the most relevant documents is crucial. We improve the \nrelevancy of the content that is added to the prompt by reranking. Similarly, the documents are usually \ntoo large to calculate an embedding, which means they must be split into smaller chunks. This is often \na difficult problem, and one approach is to have the chunks overlap to a certain extent.'),
Document(metadata={}, page_content='New Moved in/out No change\n\n© Thoughtworks, Inc. All Rights Reserved. 12\n\nTechniques\n\n1. Retrieval-augmented generation (RAG)\nAdopt\n\nRetrieval-augmented generation (RAG) is the preferred pattern for our teams to improve the quality of \nresponses generated by a large language model (LLM). We’ve successfully used it in several projects, \nincluding the popular Jugalbandi AI Platform. With RAG, information about relevant and trustworthy \ndocuments — in formats like HTML and PDF — are stored in databases that supports a vector data \ntype or efficient document search, such as pgvector, Qdrant or Elasticsearch Relevance Engine. For \na given prompt, the database is queried to retrieve relevant documents, which are then combined \nwith the prompt to provide richer context to the LLM. This results in higher quality output and greatly \nreduced hallucinations. The context window — which determines the maximum size of the LLM input \n— is limited, which means that selecting the most relevant documents is crucial. We improve the \nrelevancy of the content that is added to the prompt by reranking. Similarly, the documents are usually \ntoo large to calculate an embedding, which means they must be split into smaller chunks. This is often \na difficult problem, and one approach is to have the chunks overlap to a certain extent.'),
Document(metadata={}, page_content='New Moved in/out No change\n\n© Thoughtworks, Inc. All Rights Reserved. 12\n\nTechniques\n\n1. Retrieval-augmented generation (RAG)\nAdopt\n\nRetrieval-augmented generation (RAG) is the preferred pattern for our teams to improve the quality of \nresponses generated by a large language model (LLM). We’ve successfully used it in several projects, \nincluding the popular Jugalbandi AI Platform. With RAG, information about relevant and trustworthy \ndocuments — in formats like HTML and PDF — are stored in databases that supports a vector data \ntype or efficient document search, such as pgvector, Qdrant or Elasticsearch Relevance Engine. For \na given prompt, the database is queried to retrieve relevant documents, which are then combined \nwith the prompt to provide richer context to the LLM. This results in higher quality output and greatly \nreduced hallucinations. The context window — which determines the maximum size of the LLM input \n— is limited, which means that selecting the most relevant documents is crucial. We improve the \nrelevancy of the content that is added to the prompt by reranking. Similarly, the documents are usually \ntoo large to calculate an embedding, which means they must be split into smaller chunks. This is often \na difficult problem, and one approach is to have the chunks overlap to a certain extent.'),
Document(metadata={}, page_content='New Moved in/out No change\n\n© Thoughtworks, Inc. All Rights Reserved. 12\n\nTechniques\n\n1. Retrieval-augmented generation (RAG)\nAdopt\n\nRetrieval-augmented generation (RAG) is the preferred pattern for our teams to improve the quality of \nresponses generated by a large language model (LLM). We’ve successfully used it in several projects, \nincluding the popular Jugalbandi AI Platform. With RAG, information about relevant and trustworthy \ndocuments — in formats like HTML and PDF — are stored in databases that supports a vector data \ntype or efficient document search, such as pgvector, Qdrant or Elasticsearch Relevance Engine. For \na given prompt, the database is queried to retrieve relevant documents, which are then combined \nwith the prompt to provide richer context to the LLM. This results in higher quality output and greatly \nreduced hallucinations. The context window — which determines the maximum size of the LLM input \n— is limited, which means that selecting the most relevant documents is crucial. We improve the \nrelevancy of the content that is added to the prompt by reranking. Similarly, the documents are usually \ntoo large to calculate an embedding, which means they must be split into smaller chunks. This is often \na difficult problem, and one approach is to have the chunks overlap to a certain extent.'),
Document(metadata={}, page_content='New Moved in/out No change\n\n© Thoughtworks, Inc. All Rights Reserved. 12\n\nTechniques\n\n1. Retrieval-augmented generation (RAG)\nAdopt\n\nRetrieval-augmented generation (RAG) is the preferred pattern for our teams to improve the quality of \nresponses generated by a large language model (LLM). We’ve successfully used it in several projects, \nincluding the popular Jugalbandi AI Platform. With RAG, information about relevant and trustworthy \ndocuments — in formats like HTML and PDF — are stored in databases that supports a vector data \ntype or efficient document search, such as pgvector, Qdrant or Elasticsearch Relevance Engine. For \na given prompt, the database is queried to retrieve relevant documents, which are then combined \nwith the prompt to provide richer context to the LLM. This results in higher quality output and greatly \nreduced hallucinations. The context window — which determines the maximum size of the LLM input \n— is limited, which means that selecting the most relevant documents is crucial. We improve the \nrelevancy of the content that is added to the prompt by reranking. Similarly, the documents are usually \ntoo large to calculate an embedding, which means they must be split into smaller chunks. This is often \na difficult problem, and one approach is to have the chunks overlap to a certain extent.')],
'answer': 'Yes, RAG has been adopted as the preferred pattern for improving the quality of responses generated by a large language model.'}

API 参考

有关所有Needle功能和配置的详细文档,请访问API参考:https://docs.needle-ai.com


这个页面有帮助吗?