Index

MultiModalLLMMetadata

Bases: BaseModel

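Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| context_window | int \| None | Total number of tokens the model can be input when generating a response. | 3900 |
| num_output | int \| None | Number of tokens the model can output when generating a response. | 256 |
| num_input_files | int \| None | Number of input files the model can take when generating a response. | 10 |
| is_function_calling_model | bool \| None | Set True if the model supports function calling messages, similar to OpenAI's function calling API. For example, converting 'Email Anya to see if she wants to get coffee next Friday' to a function call like send_email(to: string, body: string). | False |
| model_name | str | The model's name used for logging, testing, and sanity checking. For some models this can be automatically discerned. For other models, like locally loaded models, this must be manually specified. | 'unknown' |
| is_chat_model | bool | Set True if the model exposes a chat interface (i.e. can be passed a sequence of messages, rather than text), like OpenAI's /v1/chat/completions endpoint. | False |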
Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
class MultiModalLLMMetadata(BaseModel):
    model_config = ConfigDict(protected_namespaces=("pydantic_model_",))
    context_window: Optional[int] = Field(
        default=DEFAULT_CONTEXT_WINDOW,
        description=(
            "Total number of tokens the model can be input when generating a response."
        ),
    )
    num_output: Optional[int] = Field(
        default=DEFAULT_NUM_OUTPUTS,
        description="Number of tokens the model can output when generating a response.",
    )
    num_input_files: Optional[int] = Field(
        default=DEFAULT_NUM_INPUT_FILES,
        description="Number of input files the model can take when generating a response.",
    )
    is_function_calling_model: Optional[bool] = Field(
        default=False,
        # SEE: https://openai.com/blog/function-calling-and-other-api-updates
        description=(
            "Set True if the model supports function calling messages, similar to"
            " OpenAI's function calling API. For example, converting 'Email Anya to"
            " see if she wants to get coffee next Friday' to a function call like"
            " `send_email(to: string, body: string)`."
        ),
    )
    model_name: str = Field(
        default="unknown",
        description=(
            "The model's name used for logging, testing, and sanity checking. For some"
            " models this can be automatically discerned. For other models, like"
            " locally loaded models, this must be manually specified."
        ),
    )

    is_chat_model: bool = Field(
        default=False,
        description=(
            "Set True if the model exposes a chat interface (i.e. can be passed a"
            " sequence of messages, rather than text), like OpenAI's"
            " /v1/chat/completions endpoint."
        ),
    )
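
As a rough illustration of how these fields might be consumed, the sketch below mirrors the documented fields and defaults in a plain dataclass and computes how many tokens remain for the prompt once `num_output` is reserved. `MetadataSketch` is a stdlib stand-in, not the library's pydantic model, and `prompt_budget` is a hypothetical helper introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetadataSketch:
    """Stand-in mirroring the documented fields and defaults (not the real class)."""

    context_window: Optional[int] = 3900
    num_output: Optional[int] = 256
    num_input_files: Optional[int] = 10
    is_function_calling_model: Optional[bool] = False
    model_name: str = "unknown"
    is_chat_model: bool = False


def prompt_budget(meta: MetadataSketch) -> Optional[int]:
    """Tokens left for the prompt after reserving room for the output."""
    if meta.context_window is None or meta.num_output is None:
        return None  # limits unknown: the caller has to decide what to do
    return max(meta.context_window - meta.num_output, 0)


default_meta = MetadataSketch()
local_meta = MetadataSketch(
    context_window=8192, num_output=512, model_name="my-local-model"
)

print(prompt_budget(default_meta))  # 3644
print(prompt_budget(local_meta))    # 7680
```

Because `context_window` and `num_output` are optional, any arithmetic on them needs a `None` guard like the one above.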

MultiModalLLM

Bases: ChainableMixin, BaseComponent, DispatcherSpanMixin

Multi-Modal LLM interface.

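Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callback_manager | CallbackManager | Callback manager that handles callbacks for events within LlamaIndex. Details follow this table. | <dynamic> |

The callback manager provides a way to call handlers on event starts/ends.

Additionally, the callback manager traces the current stack of events. It does this by using a few key attributes:
- trace_stack - The current stack of events that have not ended yet. When an event ends, it is removed from the stack. Since this is a contextvar, it is unique to each thread/task.
- trace_map - A mapping of event ids to their children events. On the start of events, the bottom of the trace stack is used as the current parent event for the trace map.
- trace_id - A simple name for the current trace, usually denoting the entrypoint (query, index_construction, insert, etc.)

Args: handlers (List[BaseCallbackHandler]): list of handlers to use.

Usage:
    with callback_manager.event(CBEventType.QUERY) as event:
        event.on_start(payload={key, val})
        ...
        event.on_end(payload={key, val})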
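
The bookkeeping described above can be pictured with a small stdlib-only model. This is a sketch of the idea, not LlamaIndex's implementation: `TraceSketch`, its `event` context manager, and the `"root"` parent id are names invented here. It also assumes that the parent of a new event is the most recently started event that has not yet ended.

```python
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator, List


class TraceSketch:
    """Toy model of an event stack plus a parent -> children map."""

    def __init__(self) -> None:
        self.trace_stack: List[str] = ["root"]  # events that have not ended yet
        self.trace_map: Dict[str, List[str]] = defaultdict(list)

    @contextmanager
    def event(self, event_id: str) -> Iterator[str]:
        parent = self.trace_stack[-1]  # assumed: latest unfinished event
        self.trace_map[parent].append(event_id)
        self.trace_stack.append(event_id)  # event starts
        try:
            yield event_id
        finally:
            self.trace_stack.pop()  # event ends, so it leaves the stack


tracer = TraceSketch()
with tracer.event("query"):
    with tracer.event("retrieve"):
        pass
    with tracer.event("llm"):
        pass

print(dict(tracer.trace_map))  # {'root': ['query'], 'query': ['retrieve', 'llm']}
print(tracer.trace_stack)      # ['root']
```

The real stack is described as a context variable, so each thread or task gets its own copy; the sketch uses a plain list for brevity.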
Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
class MultiModalLLM(ChainableMixin, BaseComponent, DispatcherSpanMixin):
    """Multi-Modal LLM interface."""

    model_config = ConfigDict(arbitrary_types_allowed=True)
    callback_manager: CallbackManager = Field(
        default_factory=CallbackManager, exclude=True
    )

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        # Help static checkers understand this class hierarchy
        super().__init__(*args, **kwargs)

    @property
    @abstractmethod
    def metadata(self) -> MultiModalLLMMetadata:
        """Multi-Modal LLM metadata."""

    @abstractmethod
    def complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        """Completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    def stream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseGen:
        """Streaming completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    def chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponse:
        """Chat endpoint for Multi-Modal LLM."""

    @abstractmethod
    def stream_chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponseGen:
        """Stream chat endpoint for Multi-Modal LLM."""

    # ===== Async Endpoints =====

    @abstractmethod
    async def acomplete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        """Async completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def astream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseAsyncGen:
        """Async streaming completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def achat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponse:
        """Async chat endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def astream_chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponseAsyncGen:
        """Async streaming chat endpoint for Multi-Modal LLM."""

    def _as_query_component(self, **kwargs: Any) -> QueryComponent:
        """Return query component."""
        if self.metadata.is_chat_model:
            # TODO: we don't have a separate chat component
            return MultiModalCompleteComponent(multi_modal_llm=self, **kwargs)
        else:
            return MultiModalCompleteComponent(multi_modal_llm=self, **kwargs)

    def __init_subclass__(cls, **kwargs: Any) -> None:
        """
        The callback decorators installs events, so they must be applied before
        the span decorators, otherwise the spans wouldn't contain the events.
        """
        for attr in (
            "complete",
            "acomplete",
            "stream_complete",
            "astream_complete",
            "chat",
            "achat",
            "stream_chat",
            "astream_chat",
        ):
            if callable(method := cls.__dict__.get(attr)):
                if attr.endswith("chat"):
                    setattr(cls, attr, llm_chat_callback()(method))
                else:
                    setattr(cls, attr, llm_completion_callback()(method))
        super().__init_subclass__(**kwargs)
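
The `__init_subclass__` hook above wraps only the endpoint methods that a subclass defines itself, because it looks them up in `cls.__dict__` rather than through inheritance. The stdlib-only sketch below shows that pattern in isolation. `tagging_decorator` is a hypothetical stand-in for `llm_chat_callback()` and `llm_completion_callback()`, and `Base`, `Echo`, and `Child` are toy classes, not LlamaIndex types.

```python
import functools
from typing import Any, Callable, List

calls: List[str] = []  # records which wrapper ran, for demonstration


def tagging_decorator(tag: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Hypothetical stand-in for the library's callback decorators."""

    def decorate(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            calls.append(f"{tag}:{fn.__name__}")
            return fn(*args, **kwargs)

        return wrapper

    return decorate


class Base:
    def __init_subclass__(cls, **kwargs: Any) -> None:
        for attr in ("complete", "chat"):
            # Only methods defined directly on this subclass are wrapped.
            if callable(method := cls.__dict__.get(attr)):
                tag = "chat" if attr.endswith("chat") else "completion"
                setattr(cls, attr, tagging_decorator(tag)(method))
        super().__init_subclass__(**kwargs)


class Echo(Base):
    def complete(self, prompt: str) -> str:
        return prompt.upper()

    def chat(self, message: str) -> str:
        return f"echo: {message}"


class Child(Echo):  # defines nothing new, so nothing is wrapped a second time
    pass


print(Child().complete("hi"))  # HI
print(Echo().chat("hi"))       # echo: hi
print(calls)                   # ['completion:complete', 'chat:chat']
```

Looking methods up in `cls.__dict__` means an inherited method is not wrapped again in each further subclass, so one call produces one callback event.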

metadata abstractmethod property

Multi-Modal LLM metadata.

complete abstractmethod

complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponse

Completion endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
def complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponse:
    """Completion endpoint for Multi-Modal LLM."""

stream_complete abstractmethod

stream_complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponseGen

Streaming completion endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
def stream_complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponseGen:
    """Streaming completion endpoint for Multi-Modal LLM."""
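
To make the difference between `complete` and `stream_complete` concrete, here is a stdlib-only sketch of the two call shapes. Nothing in it comes from LlamaIndex. `ResponseSketch` and `FakeMultiModalLLM` are hypothetical stand-ins, plain strings replace `ImageNode` objects, and the example assumes each streamed item carries the text so far plus the newest chunk.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class ResponseSketch:
    """Stand-in for a completion response (hypothetical, not the library type)."""

    text: str                    # everything generated so far
    delta: Optional[str] = None  # the newest chunk only


class FakeMultiModalLLM:
    """Toy model that 'describes' images by counting them."""

    def complete(self, prompt: str, image_documents: List[str]) -> ResponseSketch:
        return ResponseSketch(text=f"{prompt} -> {len(image_documents)} image(s)")

    def stream_complete(
        self, prompt: str, image_documents: List[str]
    ) -> Iterator[ResponseSketch]:
        full = self.complete(prompt, image_documents).text
        so_far = ""
        for word in full.split(" "):
            chunk = word if not so_far else " " + word
            so_far += chunk
            yield ResponseSketch(text=so_far, delta=chunk)


llm = FakeMultiModalLLM()
images = ["cat.png", "dog.png"]  # plain strings stand in for ImageNode objects

final = None
for partial in llm.stream_complete("Describe", images):
    final = partial  # each item supersedes the previous one

print(final.text)  # Describe -> 2 image(s)
```

A caller that only wants the finished text can keep the last item, while one that renders output incrementally can print each `delta` as it arrives.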

chat abstractmethod

chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

Chat endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
def chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponse:
    """Chat endpoint for Multi-Modal LLM."""

stream_chat abstractmethod

stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseGen

Stream chat endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
def stream_chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponseGen:
    """Stream chat endpoint for Multi-Modal LLM."""

acomplete abstractmethod async

acomplete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponse

Async completion endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
async def acomplete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponse:
    """Async completion endpoint for Multi-Modal LLM."""

astream_complete abstractmethod async

astream_complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponseAsyncGen

Async streaming completion endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
async def astream_complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponseAsyncGen:
    """Async streaming completion endpoint for Multi-Modal LLM."""

achat abstractmethod async

achat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

Async chat endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
async def achat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponse:
    """Async chat endpoint for Multi-Modal LLM."""

astream_chat abstractmethod async

astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseAsyncGen

Async streaming chat endpoint for Multi-Modal LLM.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
@abstractmethod
async def astream_chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponseAsyncGen:
    """Async streaming chat endpoint for Multi-Modal LLM."""
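
The async streaming methods are declared with `async def` yet return an async generator type, which suggests a two-step calling pattern: await the call to obtain the generator, then iterate it with `async for`. The stdlib-only sketch below shows that shape. `FakeAsyncLLM` is a hypothetical toy class, and plain strings stand in for `ChatMessage` and `ChatResponse`.

```python
import asyncio
from typing import AsyncIterator, List


class FakeAsyncLLM:
    """Toy stand-in; not a real MultiModalLLM subclass."""

    async def achat(self, messages: List[str]) -> str:
        await asyncio.sleep(0)  # pretend to wait on a network call
        return f"reply to {len(messages)} message(s)"

    async def astream_chat(self, messages: List[str]) -> AsyncIterator[str]:
        reply = await self.achat(messages)

        async def gen() -> AsyncIterator[str]:
            for word in reply.split(" "):
                await asyncio.sleep(0)
                yield word

        return gen()  # the coroutine's result is the async generator


async def main() -> List[str]:
    llm = FakeAsyncLLM()
    stream = await llm.astream_chat(["hello", "describe this image"])  # step 1
    return [chunk async for chunk in stream]                           # step 2


chunks = asyncio.run(main())
print(chunks)  # ['reply', 'to', '2', 'message(s)']
```

Skipping the `await` in step 1 would leave `stream` as a coroutine object, which `async for` cannot iterate.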

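BaseMultiModalComponent

Bases: QueryComponent

Base LLM component.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| multi_modal_llm | MultiModalLLM | LLM | required |
| streaming | bool | Streaming mode | False |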
Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
class BaseMultiModalComponent(QueryComponent):
    """Base LLM component."""

    model_config = ConfigDict(arbitrary_types_allowed=True)
    multi_modal_llm: MultiModalLLM = Field(..., description="LLM")
    streaming: bool = Field(default=False, description="Streaming mode")

    def set_callback_manager(self, callback_manager: Any) -> None:
        """Set callback manager."""

set_callback_manager

set_callback_manager(callback_manager: Any) -> None

Set callback manager.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
def set_callback_manager(self, callback_manager: Any) -> None:
    """Set callback manager."""

MultiModalCompleteComponent

Bases: BaseMultiModalComponent

Multi-modal completion component.

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
class MultiModalCompleteComponent(BaseMultiModalComponent):
    """Multi-modal completion component."""

    def _validate_component_inputs(self, input: Dict[str, Any]) -> Dict[str, Any]:
        """Validate component inputs during run_component."""
        if "prompt" not in input:
            raise ValueError("Prompt must be in input dict.")

        # do special check to see if prompt is a list of chat messages
        if isinstance(input["prompt"], get_args(List[ChatMessage])):
            raise NotImplementedError(
                "Chat messages not yet supported as input to multi-modal model."
            )
        else:
            input["prompt"] = validate_and_convert_stringable(input["prompt"])

        # make sure image documents are valid
        if "image_documents" in input:
            if not isinstance(input["image_documents"], list):
                raise ValueError("image_documents must be a list.")
            for doc in input["image_documents"]:
                if not isinstance(doc, (ImageDocument, ImageNode)):
                    raise ValueError(
                        "image_documents must be a list of ImageNode objects."
                    )

        return input

    def _run_component(self, **kwargs: Any) -> Any:
        """Run component."""
        # TODO: support only complete for now
        prompt = kwargs["prompt"]
        image_documents = kwargs.get("image_documents", [])

        response: Any
        if self.streaming:
            response = self.multi_modal_llm.stream_complete(prompt, image_documents)
        else:
            response = self.multi_modal_llm.complete(prompt, image_documents)
        return {"output": response}

    async def _arun_component(self, **kwargs: Any) -> Any:
        """Run component."""
        # TODO: support only complete for now
        # non-trivial to figure how to support chat/complete/etc.
        prompt = kwargs["prompt"]
        image_documents = kwargs.get("image_documents", [])

        response: Any
        if self.streaming:
            response = await self.multi_modal_llm.astream_complete(
                prompt, image_documents
            )
        else:
            response = await self.multi_modal_llm.acomplete(prompt, image_documents)
        return {"output": response}

    @property
    def input_keys(self) -> InputKeys:
        """Input keys."""
        # TODO: support only complete for now
        return InputKeys.from_keys({"prompt", "image_documents"})

    @property
    def output_keys(self) -> OutputKeys:
        """Output keys."""
        return OutputKeys.from_keys({"output"})
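
One detail in `_validate_component_inputs` is worth checking by hand. `get_args(List[ChatMessage])` evaluates to the tuple `(ChatMessage,)`, so the `isinstance` test appears to match a single `ChatMessage` rather than a list of them, despite what the comment says. The stdlib-only sketch below reproduces this with a hypothetical `Msg` class standing in for `ChatMessage`. It also shows one possible stricter check, `is_message_list`, a helper name invented for this example.

```python
from typing import Any, List, get_args


class Msg:
    """Hypothetical stand-in for ChatMessage."""


def matches_as_written(value: Any) -> bool:
    # Same shape as the check in the source: isinstance against get_args(List[...]).
    return isinstance(value, get_args(List[Msg]))


def is_message_list(value: Any) -> bool:
    # One possible way to test for an actual non-empty list of messages.
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(item, Msg) for item in value)
    )


print(get_args(List[Msg]) == (Msg,))  # True
print(matches_as_written(Msg()))      # True: a single message matches
print(matches_as_written([Msg()]))    # False: a list of messages does not
print(is_message_list([Msg()]))       # True
```

Whether the library intends the single-message behaviour is not stated in the source, so treat this as an observation about the check rather than a confirmed bug.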

input_keys property

input_keys: InputKeys

Input keys.

output_keys property

output_keys: OutputKeys

Output keys.