Node parser semantic chunking

SemanticChunkingQueryEnginePack #

Bases: BaseLlamaPack

Semantic Chunking Query Engine Pack.

Takes a list of documents, parses them with a semantic embedding chunker, and runs a query engine over the resulting chunks.
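To illustrate the idea behind `breakpoint_percentile_threshold`, here is a toy sketch (not the pack's actual implementation, which uses `OpenAIEmbedding` and buffered sentence groups): adjacent sentence embeddings are compared by cosine distance, and a chunk boundary is placed wherever the distance exceeds the given percentile of all observed distances. The `semantic_split` helper and hand-made 2-D embeddings below are hypothetical, for demonstration only.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)


def semantic_split(embeddings, sentences, breakpoint_percentile_threshold=95.0):
    """Split sentences into chunks at points where the distance between
    adjacent sentence embeddings exceeds the given percentile cutoff."""
    distances = [
        cosine_distance(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    # Percentile cutoff computed over the observed adjacent distances.
    cutoff_index = min(
        int(len(distances) * breakpoint_percentile_threshold / 100.0),
        len(distances) - 1,
    )
    cutoff = sorted(distances)[cutoff_index]
    chunks, current = [], [sentences[0]]
    for i, dist in enumerate(distances):
        if dist > cutoff:  # semantic breakpoint: start a new chunk
            chunks.append(" ".join(current))
            current = []
        current.append(sentences[i + 1])
    chunks.append(" ".join(current))
    return chunks


sents = ["Cats purr.", "Cats nap.", "Stocks rose.", "Stocks fell."]
embs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
# With so few sentences, a low threshold is needed to surface the one
# large distance between the "cats" and "stocks" topics.
print(semantic_split(embs, sents, breakpoint_percentile_threshold=50.0))
```

A higher threshold (the default 95.0) yields fewer, larger chunks, since only the most extreme distances count as breakpoints; with realistic document sizes there are enough adjacent-sentence distances for the percentile to be meaningful.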

Source code in llama_index/packs/node_parser_semantic_chunking/base.py
class SemanticChunkingQueryEnginePack(BaseLlamaPack):
    """语义分块查询引擎包。

    接收一个文档列表,使用语义嵌入分块器对其进行解析,并在生成的分块上运行查询引擎。"""

    def __init__(
        self,
        documents: List[Document],
        buffer_size: int = 1,
        breakpoint_percentile_threshold: float = 95.0,
    ) -> None:
        """初始化参数。"""
        self.embed_model = OpenAIEmbedding()
        self.splitter = SemanticChunker(
            buffer_size=buffer_size,
            breakpoint_percentile_threshold=breakpoint_percentile_threshold,
            embed_model=self.embed_model,
        )

        nodes = self.splitter.get_nodes_from_documents(documents)
        self.vector_index = VectorStoreIndex(nodes)
        self.query_engine = self.vector_index.as_query_engine()

    def get_modules(self) -> Dict[str, Any]:
        return {
            "vector_index": self.vector_index,
            "query_engine": self.query_engine,
            "splitter": self.splitter,
            "embed_model": self.embed_model,
        }

    def run(self, query: str) -> Any:
        """运行流水线。"""
        return self.query_engine.query(query)

run #

run(query: str) -> Any

Run the pipeline.

Source code in llama_index/packs/node_parser_semantic_chunking/base.py
def run(self, query: str) -> Any:
    """运行流水线。"""
    return self.query_engine.query(query)