聊天模型集成测试#

class langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests[source]#

聊天模型集成测试的基类。

测试子类必须实现 chat_model_class 和 chat_model_params 属性，以指定要测试的模型及其初始化参数。

示例：

from typing import Type

from langchain_tests.integration_tests import ChatModelIntegrationTests
from my_package.chat_models import MyChatModel


class TestMyChatModelIntegration(ChatModelIntegrationTests):
    @property
    def chat_model_class(self) -> Type[MyChatModel]:
        # Return the chat model class to test here
        return MyChatModel

    @property
    def chat_model_params(self) -> dict:
        # Return initialization parameters for the model.
        return {"model": "model-001", "temperature": 0}

注意

个别测试方法的API参考包括故障排除提示。

测试子类必须实现以下两个属性：

chat_model_class

要测试的聊天模型类，例如 ChatParrotLink。

示例：

@property
def chat_model_class(self) -> Type[ChatParrotLink]:
    return ChatParrotLink

chat_model_params

聊天模型的初始化参数。

示例：

@property
def chat_model_params(self) -> dict:
    return {"model": "bird-brain-001", "temperature": 0}

此外，测试子类可以通过选择性地覆盖以下属性来控制测试哪些功能（例如工具调用或多模态）。展开以查看详细信息：

属性

`chat_model_class`	要测试的聊天模型类，例如 `ChatParrotLink`。
`chat_model_params`	聊天模型的初始化参数。
`has_structured_output`	(bool) 聊天模型是否支持结构化输出。
`has_tool_calling`	(bool) 模型是否支持工具调用。
`returns_usage_metadata`	(bool) 聊天模型是否在调用和流式响应时返回使用元数据。
`supported_usage_metadata_details`	(字典) 在调用和流中发出的使用元数据详细信息。
`supports_anthropic_inputs`	(bool) 聊天模型是否支持Anthropic风格的输入。
`supports_image_inputs`	(bool) 聊天模型是否支持图像输入，默认为 `False`。
`supports_image_tool_message`	(bool) 聊天模型是否支持包含图像内容的ToolMessages。
`supports_json_mode`	(bool) 聊天模型是否支持JSON模式。
`supports_video_inputs`	(bool) 聊天模型是否支持视频输入，默认为 `False`。
`tool_choice_value`	(None 或 str) 用于测试时的工具选择。

方法

`test_abatch`(model)	测试以验证 await model.abatch([messages]) 是否有效。
`test_ainvoke`(model)	测试以验证 await model.ainvoke(simple_message) 是否有效。
`test_anthropic_inputs`(model)	测试模型是否可以处理Anthropic风格的消息历史。
`test_astream`(model)	测试以验证 await model.astream(simple_message) 是否有效。
`test_batch`(model)	测试以验证 model.batch([messages]) 是否有效。
`test_bind_runnables_as_tools`(model)	测试模型是否为从LangChain runnables派生的工具生成工具调用。
`test_conversation`(model)	测试以验证模型能够处理多轮对话。
`test_image_inputs`(model)	测试模型能够处理图像输入。
`test_image_tool_message`(model)	测试模型能否处理带有图像输入的ToolMessages。
`test_invoke`(model)	测试以验证 model.invoke(simple_message) 是否有效。
`test_json_mode`(model)	通过`JSON模式测试结构化输出。
`test_message_with_name`(model)	测试可以处理带有`name`字段值的HumanMessage。
`test_stop_sequence`(model)	测试当使用`stop`参数调用模型时，模型不会失败，该参数是用于在某个标记处停止生成的标准参数。
`test_stream`(model)	测试以验证 model.stream(simple_message) 是否有效。
`test_structured_few_shot_examples`(model, ...)	测试模型能否处理带有工具调用的少样本示例。
`test_structured_output`(model)	测试以验证在调用和流式传输时生成结构化输出。
`test_structured_output_async`(model)	测试以验证在调用和流式传输时都会生成结构化输出。
`test_structured_output_optional_param`(model)	测试以验证我们可以生成包含可选参数的结构化输出。
`test_structured_output_pydantic_2_v1`(model)	测试以验证我们可以使用 pydantic.v1.BaseModel 生成结构化输出。
`test_tool_calling`(model)	测试模型生成工具调用。
`test_tool_calling_async`(model)	测试模型生成工具调用。
`test_tool_calling_with_no_arguments`(model)	测试模型是否为没有参数的生成工具调用。
`test_tool_message_error_status`(model, ...)	测试可以处理带有`status="error"`的ToolMessage。
`test_tool_message_histories_list_content`(...)	测试消息历史记录是否与列表工具内容兼容（例如Anthropic格式）。
`test_tool_message_histories_string_content`(...)	测试消息历史记录是否与字符串工具内容兼容（例如OpenAI格式）。
`test_usage_metadata`(model)	测试以验证模型返回正确的使用元数据。
`test_usage_metadata_streaming`(model)	测试以验证模型在流模式下返回正确的使用元数据。

async test_abatch(model: BaseChatModel) → None[source]#

测试以验证await model.abatch([messages])是否有效。

这应该适用于所有集成。测试模型在单个批次中异步处理多个提示的能力。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

async test_ainvoke(model: BaseChatModel) → None[来源]#

测试以验证await model.ainvoke(simple_message)是否有效。

这应该适用于所有集成。通过此测试并不表示“原生异步”实现，而是表示该模型可以在异步上下文中使用。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_anthropic_inputs(model: BaseChatModel) → None[source]#

测试模型是否可以处理Anthropic风格的消息历史。

这些消息历史将包括带有tool_use内容块的AIMessage对象，例如，

AIMessage(
    [
        {"type": "text", "text": "Hmm let me think about that"},
        {
            "type": "tool_use",
            "input": {"fav_color": "green"},
            "id": "foo",
            "name": "color_picker",
        },
    ]
)

以及包含tool_result内容块的HumanMessage对象：

HumanMessage(
    [
        {
            "type": "tool_result",
            "tool_use_id": "foo",
            "content": [
                {
                    "type": "text",
                    "text": "green is a great pick! that's my sister's favorite color",  # noqa: E501
                }
            ],
            "is_error": False,
        },
        {"type": "text", "text": "what's my sister's favorite color"},
    ]
)

如果模型不支持这种形式的消息（或者通常不支持工具调用），则应跳过此测试。请参阅下面的配置。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

async test_astream(model: BaseChatModel) → None[源代码]#

测试以验证await model.astream(simple_message)是否有效。

这应该适用于所有集成。通过此测试并不表示“原生异步”或“流式”实现，而是表示该模型可以在异步流式上下文中使用。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_batch(model: BaseChatModel) → None[source]#

测试以验证model.batch([messages])是否有效。

这应该适用于所有集成。测试模型处理单个批次中多个提示的能力。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_bind_runnables_as_tools(model: BaseChatModel) → None[源代码]#

测试模型是否为从LangChain可运行对象派生的工具生成工具调用。如果测试类上的has_tool_calling属性设置为False，则跳过此测试。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_conversation(model: BaseChatModel) → None[源代码]#

测试以验证模型能否处理多轮对话。

这应该适用于所有集成。测试模型处理交替的人类和AI消息序列作为生成下一个响应的上下文的能力。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_image_inputs(model: BaseChatModel) → None[源代码]#

测试模型能够处理图像输入。

如果模型不支持图像输入，则应跳过此测试（请参阅下面的配置）。这些将以带有OpenAI风格图像内容块的消息形式出现：

[
    {"type": "text", "text": "describe the weather in this image"},
    {
        "type": "image_url",
        "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
    },
]

参见 https://python.langchain.com/docs/concepts/multimodality/

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_image_tool_message(model: BaseChatModel) → None[source]#

测试模型能够处理带有图像输入的ToolMessages。

如果模型不支持以下形式的消息，则应跳过此测试：

ToolMessage(
    content=[
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
        },
    ],
    tool_call_id="1",
    name="random_image",
)

可以通过将supports_image_tool_message属性设置为False来跳过此测试（请参阅下面的配置）。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_invoke(model: BaseChatModel) → None[source]#

测试以验证model.invoke(simple_message)是否有效。

这应该适用于所有集成。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_json_mode(model: BaseChatModel) → None[源代码]#

通过JSON模式测试结构化输出。

此测试是可选的，如果模型不支持JSON模式功能（请参见下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_message_with_name(model: BaseChatModel) → None[源代码]#

测试可以处理带有name字段值的HumanMessage。

这些消息可能采取以下形式：

HumanMessage("hello", name="example_user")

如果可能，name 字段应该被解析并适当地传递给模型。否则，它应该被忽略。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_stop_sequence(model: BaseChatModel) → None[source]#

测试模型在调用stop参数时不会失败，这是一个用于在某个标记处停止生成的标准参数。

更多关于标准参数的信息在这里：https://python.langchain.com/docs/concepts/chat_models/#standard-parameters

这应该适用于所有集成。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_stream(model: BaseChatModel) → None[source]#

测试以验证model.stream(simple_message)是否有效。

这应该适用于所有集成。通过此测试并不表示“流式”实现，而是表示该模型可以在流式上下文中使用。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_structured_few_shot_examples(model: BaseChatModel, my_adder_tool: BaseTool) → None[source]#

测试模型是否能够处理带有工具调用的少样本示例。

这些表示为以下形式的消息序列：

HumanMessage 带有字符串内容;
AIMessage 带有填充的 tool_calls 属性；
ToolMessage 带有字符串内容；
AIMessage 包含字符串内容（一个答案）；
HuamnMessage 带有字符串内容（一个后续问题）。

如果模型不支持工具调用，则应跳过此测试（请参阅下面的配置）。

Parameters:

model (BaseChatModel)
my_adder_tool (BaseTool)

Return type:

无

test_structured_output(model: BaseChatModel) → None[source]#

测试以验证在调用和流式传输时是否生成了结构化输出。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

async test_structured_output_async(model: BaseChatModel) → None[source]#

测试以验证在调用和流式传输时是否生成了结构化输出。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_structured_output_optional_param(model: BaseChatModel) → None[source]#

测试以验证我们可以生成包含可选参数的结构化输出。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_structured_output_pydantic_2_v1(model: BaseChatModel) → None[source]#

测试验证我们可以使用 pydantic.v1.BaseModel 生成结构化输出。

pydantic.v1.BaseModel 在 pydantic 2 包中可用。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_tool_calling(model: BaseChatModel) → None[source]#

测试模型是否生成工具调用。如果测试类上的has_tool_calling属性设置为False，则跳过此测试。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

async test_tool_calling_async(model: BaseChatModel) → None[source]#

测试模型是否生成工具调用。如果测试类上的has_tool_calling属性设置为False，则跳过此测试。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_tool_calling_with_no_arguments(model: BaseChatModel) → None[source]#

测试模型是否为没有参数的生成工具调用。如果测试类上的has_tool_calling属性设置为False，则跳过此测试。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_tool_message_error_status(model: BaseChatModel, my_adder_tool: BaseTool) → None[source]#

测试可以处理带有status="error"的ToolMessage。

这些消息可能采取以下形式：

ToolMessage(
    "Error: Missing required argument 'b'.",
    name="my_adder_tool",
    tool_call_id="abc123",
    status="error",
)

如果可能，应该解析status字段并适当地传递给模型。

此测试是可选的，如果模型不支持工具调用（请参阅下面的配置），则应跳过。

Parameters:

model (BaseChatModel)
my_adder_tool (BaseTool)

Return type:

无

test_tool_message_histories_list_content(model: BaseChatModel, my_adder_tool: BaseTool) → None[source]#

测试消息历史记录是否与列表工具内容兼容（例如Anthropic格式）。

这些消息历史将包括带有“工具使用”和内容块的AIMessage对象，例如，

[
    {"type": "text", "text": "Hmm let me think about that"},
    {
        "type": "tool_use",
        "input": {"fav_color": "green"},
        "id": "foo",
        "name": "color_picker",
    },
]

如果模型不支持工具调用，则应跳过此测试（请参阅下面的配置）。

Parameters:

model (BaseChatModel)
my_adder_tool (BaseTool)

Return type:

无

test_tool_message_histories_string_content(model: BaseChatModel, my_adder_tool: BaseTool) → None[source]#

测试消息历史记录是否与字符串工具内容兼容（例如，OpenAI格式）。如果模型通过此测试，它应该与遵循OpenAI格式的提供者生成的消息兼容。

如果模型不支持工具调用，则应跳过此测试（请参阅下面的配置）。

Parameters:

model (BaseChatModel)
my_adder_tool (BaseTool)

Return type:

无

test_usage_metadata(model: BaseChatModel) → None[源代码]#

测试以验证模型返回正确的使用元数据。

此测试是可选的，如果模型未返回使用元数据（请参阅下面的配置），则应跳过。

Parameters:: 模型 (BaseChatModel)
Return type:: 无

test_usage_metadata_streaming(model: BaseChatModel) → None[source]#

测试以验证模型在流模式下返回正确的使用元数据。

Parameters:: 模型 (BaseChatModel)
Return type:: 无