Index

MultiModalLLMMetadata #

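Bases: BaseModel

Parameters:

Name	Type	Description	Default
`context_window`	`int \| None`	Total number of tokens the model can take as input when generating a response.	`3900`
`num_output`	`int \| None`	Number of tokens the model can output when generating a response.	`256`
`num_input_files`	`int \| None`	Number of input files the model can take when generating a response.	`10`
`is_function_calling_model`	`bool \| None`	Set True if the model supports function calling messages, similar to OpenAI's function calling API. For example, converting 'Email Anya to see if she wants to get coffee next Friday' to a function call like `send_email(to: string, body: string)`.	`False`
`model_name`	`str`	The model's name used for logging, testing, and sanity checking. For some models this can be automatically discerned. For other models, like locally loaded models, this must be manually specified.	`'unknown'`
`is_chat_model`	`bool`	Set True if the model exposes a chat interface (i.e. can be passed a sequence of messages, rather than text), like OpenAI's /v1/chat/completions endpoint.	`False`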

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

class MultiModalLLMMetadata(BaseModel):
    model_config = ConfigDict(protected_namespaces=("pydantic_model_",))
    context_window: Optional[int] = Field(
        default=DEFAULT_CONTEXT_WINDOW,
        description=(
            "Total number of tokens the model can be input when generating a response."
        ),
    )
    num_output: Optional[int] = Field(
        default=DEFAULT_NUM_OUTPUTS,
        description="Number of tokens the model can output when generating a response.",
    )
    num_input_files: Optional[int] = Field(
        default=DEFAULT_NUM_INPUT_FILES,
        description="Number of input files the model can take when generating a response.",
    )
    is_function_calling_model: Optional[bool] = Field(
        default=False,
        # SEE: https://openai.com/blog/function-calling-and-other-api-updates
        description=(
            "Set True if the model supports function calling messages, similar to"
            " OpenAI's function calling API. For example, converting 'Email Anya to"
            " see if she wants to get coffee next Friday' to a function call like"
            " `send_email(to: string, body: string)`."
        ),
    )
    model_name: str = Field(
        default="unknown",
        description=(
            "The model's name used for logging, testing, and sanity checking. For some"
            " models this can be automatically discerned. For other models, like"
            " locally loaded models, this must be manually specified."
        ),
    )

    is_chat_model: bool = Field(
        default=False,
        description=(
            "Set True if the model exposes a chat interface (i.e. can be passed a"
            " sequence of messages, rather than text), like OpenAI's"
            " /v1/chat/completions endpoint."
        ),
    )
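
As a rough illustration of how the limits above might be consumed, the sketch below mirrors the documented fields and defaults in a plain dataclass. `MetadataSketch`, `prompt_token_budget` and `too_many_files` are hypothetical names, not part of the library, and the budget rule (reserving `num_output` tokens out of `context_window`) is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetadataSketch:
    """Stand-in mirroring the fields and defaults listed above (not the real class)."""

    context_window: Optional[int] = 3900
    num_output: Optional[int] = 256
    num_input_files: Optional[int] = 10
    is_function_calling_model: Optional[bool] = False
    model_name: str = "unknown"
    is_chat_model: bool = False


def prompt_token_budget(meta: MetadataSketch) -> Optional[int]:
    """Tokens left for the prompt after reserving room for the output (assumed rule)."""
    if meta.context_window is None or meta.num_output is None:
        return None  # limits unknown: the caller has to decide what to do
    return max(meta.context_window - meta.num_output, 0)


def too_many_files(meta: MetadataSketch, n_files: int) -> bool:
    """True when n_files exceeds the advertised input-file limit."""
    return meta.num_input_files is not None and n_files > meta.num_input_files


default_meta = MetadataSketch()
print(prompt_token_budget(default_meta))
print(too_many_files(default_meta, 12))
```

Every numeric field is optional, so a caller presumably has to handle `None` before doing arithmetic on it, which is why both helpers check for it explicitly.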

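MultiModalLLM #

Bases: ChainableMixin, BaseComponent, DispatcherSpanMixin

Multi-Modal LLM interface.

Parameters:

Name	Type	Description	Default
`callback_manager`	`CallbackManager`	Callback manager that handles callbacks for events within LlamaIndex. The callback manager provides a way to call handlers when events start and end. It also tracks the current stack of events, using a few key attributes: - trace_stack - The current stack of events that have not ended yet. When an event ends, it is removed from the stack. Since this is a context variable, each thread/task has its own stack. - trace_map - A mapping of event IDs to their child events. When an event starts, the event at the bottom of trace_stack is used as the current parent event in trace_map. - trace_id - A simple name for the current trace, usually denoting the entry point (query, index_construction, insert, etc.). Args: handlers (List[BaseCallbackHandler]): list of handlers to use. Usage: with callback_manager.event(CBEventType.QUERY) as event: event.on_start(payload={key, val}) ... event.on_end(payload={key, val})	`<dynamic>`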
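
The stack-and-map bookkeeping described for `callback_manager` can be pictured with a small standard-library sketch. `TraceSketch` and the `"root"` sentinel are hypothetical and not the library's implementation. The sketch assumes the parent of a new event is the most recently opened event that has not ended yet, and it only exercises the context variable from a single thread.

```python
import contextvars
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Held in a context variable, as the description above mentions.
_trace_stack = contextvars.ContextVar("trace_stack", default=None)


class TraceSketch:
    """Hypothetical stand-in that records parent -> children links for events."""

    def __init__(self) -> None:
        self.trace_map: Dict[str, List[str]] = {}

    @contextmanager
    def event(self, event_id: str) -> Iterator[str]:
        stack = _trace_stack.get()
        if stack is None:
            stack = ["root"]  # sentinel parent for top-level events
            _trace_stack.set(stack)
        parent = stack[-1]  # most recently opened event that is still running
        self.trace_map.setdefault(parent, []).append(event_id)
        stack.append(event_id)
        try:
            yield event_id
        finally:
            stack.pop()  # the event ended, so it leaves the stack


tracer = TraceSketch()
with tracer.event("query"):
    with tracer.event("retrieve"):
        pass
    with tracer.event("llm"):
        pass
print(tracer.trace_map)
```

Nested `with` blocks become parent/child entries in `trace_map`, while the stack itself shrinks back as each event ends.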

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

class MultiModalLLM(ChainableMixin, BaseComponent, DispatcherSpanMixin):
    """Multi-Modal LLM interface."""

    model_config = ConfigDict(arbitrary_types_allowed=True)
    callback_manager: CallbackManager = Field(
        default_factory=CallbackManager, exclude=True
    )

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        # Help static checkers understand this class hierarchy
        super().__init__(*args, **kwargs)

    @property
    @abstractmethod
    def metadata(self) -> MultiModalLLMMetadata:
        """Multi-Modal LLM metadata."""

    @abstractmethod
    def complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        """Completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    def stream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseGen:
        """Streaming completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    def chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponse:
        """Chat endpoint for Multi-Modal LLM."""

    @abstractmethod
    def stream_chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponseGen:
        """Stream chat endpoint for Multi-Modal LLM."""

    # ===== Async Endpoints =====

    @abstractmethod
    async def acomplete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        """Async completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def astream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseAsyncGen:
        """Async streaming completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def achat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponse:
        """Async chat endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def astream_chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponseAsyncGen:
        """Async streaming chat endpoint for Multi-Modal LLM."""

    def _as_query_component(self, **kwargs: Any) -> QueryComponent:
        """Return query component."""
        if self.metadata.is_chat_model:
            # TODO: we don't have a separate chat component
            return MultiModalCompleteComponent(multi_modal_llm=self, **kwargs)
        else:
            return MultiModalCompleteComponent(multi_modal_llm=self, **kwargs)

    def __init_subclass__(cls, **kwargs: Any) -> None:
        """
        The callback decorators installs events, so they must be applied before
        the span decorators, otherwise the spans wouldn't contain the events.
        """
        for attr in (
            "complete",
            "acomplete",
            "stream_complete",
            "astream_complete",
            "chat",
            "achat",
            "stream_chat",
            "astream_chat",
        ):
            if callable(method := cls.__dict__.get(attr)):
                if attr.endswith("chat"):
                    setattr(cls, attr, llm_chat_callback()(method))
                else:
                    setattr(cls, attr, llm_completion_callback()(method))
        super().__init_subclass__(**kwargs)
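
The `__init_subclass__` hook in the listing above wraps endpoint methods at class-creation time. The sketch below reproduces that pattern with the standard library only. `record_calls`, `CALLS`, `BaseSketch` and `EchoModel` are hypothetical stand-ins for the real callback decorators and classes, and only two of the eight endpoint names are handled to keep it short.

```python
import functools
from typing import Any, Callable, List

CALLS: List[str] = []  # hypothetical event log standing in for callback events


def record_calls(kind: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator factory that logs the start and end of each wrapped call."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
            CALLS.append(f"{kind}:{fn.__name__}:start")
            try:
                return fn(self, *args, **kwargs)
            finally:
                CALLS.append(f"{kind}:{fn.__name__}:end")

        return wrapper

    return decorator


class BaseSketch:
    def __init_subclass__(cls, **kwargs: Any) -> None:
        # Look only at cls.__dict__, so a method is wrapped once, in the
        # class that defines it, and inherited wrappers are left alone.
        for attr in ("complete", "chat"):
            if callable(method := cls.__dict__.get(attr)):
                kind = "chat" if attr.endswith("chat") else "completion"
                setattr(cls, attr, record_calls(kind)(method))
        super().__init_subclass__(**kwargs)


class EchoModel(BaseSketch):
    def complete(self, prompt: str) -> str:
        return prompt.upper()


result = EchoModel().complete("hi")
print(result, CALLS)
```

With this pattern a subclass author writes a plain method, and the instrumentation is attached without any decorator in the subclass body.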

metadata `abstractmethod` `property` #

metadata: MultiModalLLMMetadata

Multi-Modal LLM metadata.

complete `abstractmethod` #

complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponse

Completion endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
def complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponse:
    """Completion endpoint for Multi-Modal LLM."""

stream_complete `abstractmethod` #

stream_complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponseGen

Streaming completion endpoint for Multi-Modal LLM.
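
A streaming endpoint returns a generator rather than a single response. The sketch below shows one way such a generator could be produced and consumed. It assumes each chunk carries the text accumulated so far plus the newly generated piece. `ResponseSketch` and `stream_complete_sketch` are hypothetical, and plain file paths stand in for image nodes.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class ResponseSketch:
    """Stand-in for one streamed completion chunk (not the library type)."""

    text: str  # text accumulated so far
    delta: Optional[str] = None  # newly generated piece


def stream_complete_sketch(
    prompt: str, image_paths: List[str]
) -> Iterator[ResponseSketch]:
    """Yield one chunk per word of a canned answer."""
    answer = f"{len(image_paths)} image(s) received for: {prompt}"
    text = ""
    for i, word in enumerate(answer.split(" ")):
        piece = word if i == 0 else " " + word
        text += piece
        yield ResponseSketch(text=text, delta=piece)


chunks = list(stream_complete_sketch("describe", ["a.png", "b.png"]))
final_text = chunks[-1].text
print(final_text)
```

A caller that wants incremental output prints each `delta`, while a caller that only needs the result keeps the last chunk's `text`.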

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
def stream_complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponseGen:
    """Streaming completion endpoint for Multi-Modal LLM."""

chat `abstractmethod` #

chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

Chat endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
def chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponse:
    """Chat endpoint for Multi-Modal LLM."""

stream_chat `abstractmethod` #

stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseGen

Stream chat endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
def stream_chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponseGen:
    """Stream chat endpoint for Multi-Modal LLM."""

acomplete `abstractmethod` `async` #

acomplete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponse

Async completion endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
async def acomplete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponse:
    """Async completion endpoint for Multi-Modal LLM."""

astream_complete `abstractmethod` `async` #

astream_complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponseAsyncGen

Async streaming completion endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
async def astream_complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponseAsyncGen:
    """Async streaming completion endpoint for Multi-Modal LLM."""

achat `abstractmethod` `async` #

achat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

Async chat endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
async def achat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponse:
    """Async chat endpoint for Multi-Modal LLM."""

astream_chat `abstractmethod` `async` #

astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseAsyncGen

Async streaming chat endpoint for Multi-Modal LLM.
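
Because the async streaming endpoints are declared with `async def` and return an async generator, consuming one takes two steps: await the call, then iterate the result. This is the same order `_arun_component` uses when it awaits `astream_complete`. The sketch below shows the pattern with the standard library only. `astream_chat_sketch` is hypothetical and plain strings stand in for chat messages.

```python
import asyncio
from typing import AsyncIterator, List, Sequence


async def astream_chat_sketch(messages: Sequence[str]) -> AsyncIterator[str]:
    """Coroutine that returns an async generator, like the signature above."""

    async def gen() -> AsyncIterator[str]:
        for message in messages:
            await asyncio.sleep(0)  # give other tasks a chance to run
            yield f"echo: {message}"

    return gen()


async def main() -> List[str]:
    stream = await astream_chat_sketch(["hello", "world"])  # step 1: await the call
    return [chunk async for chunk in stream]  # step 2: iterate the generator


pieces = asyncio.run(main())
print(pieces)
```

Skipping the first `await` would hand `async for` a coroutine object instead of an async iterator, which raises a `TypeError`.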

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

@abstractmethod
async def astream_chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponseAsyncGen:
    """Async streaming chat endpoint for Multi-Modal LLM."""

BaseMultiModalComponent #

Bases: QueryComponent

Base LLM component.

Parameters:

Name	Type	Description	Default
`multi_modal_llm`	`MultiModalLLM`	LLM	required
`streaming`	`bool`	Streaming mode	`False`

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

class BaseMultiModalComponent(QueryComponent):
    """Base LLM component."""

    model_config = ConfigDict(arbitrary_types_allowed=True)
    multi_modal_llm: MultiModalLLM = Field(..., description="LLM")
    streaming: bool = Field(default=False, description="Streaming mode")

    def set_callback_manager(self, callback_manager: Any) -> None:
        """Set callback manager."""

set_callback_manager #

set_callback_manager(callback_manager: Any) -> None

Set callback manager.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

def set_callback_manager(self, callback_manager: Any) -> None:
    """Set callback manager."""

MultiModalCompleteComponent #

Bases: BaseMultiModalComponent

Multi-modal completion component.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

class MultiModalCompleteComponent(BaseMultiModalComponent):
    """Multi-modal completion component."""

    def _validate_component_inputs(self, input: Dict[str, Any]) -> Dict[str, Any]:
        """Validate component inputs during run_component."""
        if "prompt" not in input:
            raise ValueError("Prompt must be in input dict.")

        # do special check to see if prompt is a list of chat messages
        if isinstance(input["prompt"], get_args(List[ChatMessage])):
            raise NotImplementedError(
                "Chat messages not yet supported as input to multi-modal model."
            )
        else:
            input["prompt"] = validate_and_convert_stringable(input["prompt"])

        # make sure image documents are valid
        if "image_documents" in input:
            if not isinstance(input["image_documents"], list):
                raise ValueError("image_documents must be a list.")
            for doc in input["image_documents"]:
                if not isinstance(doc, (ImageDocument, ImageNode)):
                    raise ValueError(
                        "image_documents must be a list of ImageNode objects."
                    )

        return input

    def _run_component(self, **kwargs: Any) -> Any:
        """Run component."""
        # TODO: support only complete for now
        prompt = kwargs["prompt"]
        image_documents = kwargs.get("image_documents", [])

        response: Any
        if self.streaming:
            response = self.multi_modal_llm.stream_complete(prompt, image_documents)
        else:
            response = self.multi_modal_llm.complete(prompt, image_documents)
        return {"output": response}

    async def _arun_component(self, **kwargs: Any) -> Any:
        """Run component."""
        # TODO: support only complete for now
        # non-trivial to figure how to support chat/complete/etc.
        prompt = kwargs["prompt"]
        image_documents = kwargs.get("image_documents", [])

        response: Any
        if self.streaming:
            response = await self.multi_modal_llm.astream_complete(
                prompt, image_documents
            )
        else:
            response = await self.multi_modal_llm.acomplete(prompt, image_documents)
        return {"output": response}

    @property
    def input_keys(self) -> InputKeys:
        """Input keys."""
        # TODO: support only complete for now
        return InputKeys.from_keys({"prompt", "image_documents"})

    @property
    def output_keys(self) -> OutputKeys:
        """Output keys."""
        return OutputKeys.from_keys({"output"})
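
The input checks in `_validate_component_inputs` can be paraphrased without the library. In the sketch below, `ImageStub` and `validate_inputs` are hypothetical, `str()` is a simplification of `validate_and_convert_stringable`, and the chat-message check is left out.

```python
from typing import Any, Dict


class ImageStub:
    """Hypothetical stand-in for ImageNode / ImageDocument."""

    def __init__(self, path: str) -> None:
        self.path = path


def validate_inputs(data: Dict[str, Any]) -> Dict[str, Any]:
    """Require a prompt, coerce it to text, and type-check optional images."""
    if "prompt" not in data:
        raise ValueError("Prompt must be in input dict.")
    data["prompt"] = str(data["prompt"])  # simplified stringable conversion

    # image_documents is optional; when present it must be a list of images.
    if "image_documents" in data:
        docs = data["image_documents"]
        if not isinstance(docs, list):
            raise ValueError("image_documents must be a list.")
        if not all(isinstance(doc, ImageStub) for doc in docs):
            raise ValueError("image_documents must be a list of image objects.")
    return data


checked = validate_inputs({"prompt": 42, "image_documents": [ImageStub("a.png")]})
print(checked["prompt"])
```

Only `prompt` is mandatory. A missing `image_documents` key passes validation, and the run methods in the listing then fall back to an empty list.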

input_keys `property` #

input_keys: InputKeys

Input keys.

output_keys `property` #

output_keys: OutputKeys

Output keys.
