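How to format inputs to ChatGPT models

ChatGPT is powered by gpt-3.5-turbo and gpt-4, OpenAI's most advanced models.

You can build your own applications with gpt-3.5-turbo or gpt-4 using the OpenAI API.

Chat models take a series of messages as input and return an AI-written message as output.

This guide illustrates the chat format with a few example API calls.

1. Import the openai library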

# If needed, install and/or upgrade to the latest version of the OpenAI Python library
%pip install --upgrade openai

# Import the OpenAI Python library for calling the OpenAI API
from openai import OpenAI
import json  # used below to pretty-print the API response
import os

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as env var>"))

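2. An example chat completion API call

A chat completion API call takes the following parameters.

Required
- model: the name of the model you want to use (e.g., gpt-3.5-turbo, gpt-4, gpt-3.5-turbo-16k-1106)
- messages: a list of message objects, where each object has two required fields:
    - role: the role of the message author (system, user, assistant, or tool)
    - content: the content of the message (e.g., Write me a beautiful poem)

Messages can also contain an optional name field, which gives the message author a name, e.g., example-user, Alice, BlackbeardBot. Names may not contain spaces.

Optional
- frequency_penalty: penalizes tokens based on their frequency, reducing repetition.
- logit_bias: modifies the likelihood of specified tokens with bias values.
- logprobs: returns log probabilities of output tokens if true.
- top_logprobs: specifies the number of most likely tokens to return at each position.
- max_tokens: sets the maximum number of tokens generated in the chat completion.
- n: generates the specified number of chat completion choices for each input.
- presence_penalty: penalizes new tokens based on whether they already appear in the text.
- response_format: specifies the output format, e.g., JSON mode.
- seed: makes sampling repeatable on a best-effort basis with the specified seed.
- stop: specifies up to 4 sequences at which the API should stop generating tokens.
- stream: sends partial message deltas as tokens become available.
- temperature: sets the sampling temperature, between 0 and 2.
- top_p: uses nucleus sampling; considers the tokens with top_p probability mass.
- tools: lists the functions the model may call.
- tool_choice: controls the model's function calls (none/auto/function).
- user: a unique identifier for end-user monitoring and abuse detection.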
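The constraints listed above can be checked before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named build_chat_request that is not part of the openai library. It only checks the rules stated in this guide (allowed roles, a content field on every message, no spaces in name, temperature between 0 and 2, at most 4 stop sequences); the API itself enforces more.

```python
from typing import Any

# Roles listed in this guide.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def build_chat_request(model: str, messages: list[dict[str, Any]], **optional: Any) -> dict[str, Any]:
    """Assemble keyword arguments for a chat completion call (hypothetical helper)."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError("every message needs a content field")
        name = message.get("name") or ""
        if " " in name:
            raise ValueError("name must not contain spaces")

    temperature = optional.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    stop = optional.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    # Optional parameters are passed through unchanged.
    return {"model": model, "messages": messages, **optional}


request = build_chat_request(
    "gpt-3.5-turbo",
    [{"role": "user", "content": "Knock knock."}],
    temperature=0,
    max_tokens=20,
    stop=["\n\n"],
)
print(sorted(request))
```

With the client created in section 1, the resulting dictionary could then be passed as client.chat.completions.create(**request).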

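As of January 2024, you can also optionally submit a list of functions that tell GPT whether it can generate JSON to feed into a function. For details, see the documentation, the API reference, or the Cookbook guide How to call functions with chat models.

Typically, a conversation starts with a system message that tells the assistant how to behave, followed by alternating user and assistant messages, but you are not required to follow this format.

Let's look at an example chat API call to see how the chat format works in practice.

# Example OpenAI Python library request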
MODEL = "gpt-3.5-turbo"
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
        {"role": "assistant", "content": "Who's there?"},
        {"role": "user", "content": "Orange."},
    ],
    temperature=0,
)

print(json.dumps(json.loads(response.model_dump_json()), indent=4))

{
    "id": "chatcmpl-8dee9DuEFcg2QILtT2a6EBXZnpirM",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "Orange who?",
                "role": "assistant",
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1704461729,
    "model": "gpt-3.5-turbo-0613",
    "object": "chat.completion",
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 3,
        "prompt_tokens": 35,
        "total_tokens": 38
    }
}

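As you can see, the response object has a few fields:
- id: the ID of the request
- choices: a list of completion objects (only one, unless you set n greater than 1)
    - finish_reason: the reason the model stopped generating text (either stop, or length if the max_tokens limit was reached)
    - index: the index of the choice in the list of choices
    - logprobs: log probability information for the choice
    - message: the message object generated by the model
        - content: the content of the message
        - role: the role of the author of this message
        - tool_calls: the tool calls generated by the model, such as function calls, if tools were provided in the request
- created: the timestamp of the request
- model: the full name of the model used to generate the response
- object: the type of object returned (e.g., chat.completion)
- system_fingerprint: a fingerprint representing the backend configuration the model runs with
- usage: the number of tokens used to generate the reply, counting prompt, completion, and total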
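To make the field layout concrete, here is a small sketch that reads the fields from a plain dictionary. The dictionary is a trimmed copy of the response printed above, and summarize_response is a hypothetical helper name introduced only for illustration. With a live response, the same dictionary can be obtained via json.loads(response.model_dump_json()), as in the print call above.

```python
# Trimmed copy of the response shown above, as a plain dict.
response_dict = {
    "id": "chatcmpl-8dee9DuEFcg2QILtT2a6EBXZnpirM",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "Orange who?", "role": "assistant"},
        }
    ],
    "model": "gpt-3.5-turbo-0613",
    "usage": {"completion_tokens": 3, "prompt_tokens": 35, "total_tokens": 38},
}


def summarize_response(response: dict) -> dict:
    """Pull out the reply text, whether it was cut off, and the token usage."""
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "reply": choice["message"]["content"],
        # finish_reason is "length" when the max_tokens limit cut the reply short
        "truncated": choice["finish_reason"] == "length",
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


summary = summarize_response(response_dict)
print(summary)
```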

Extract just the reply with:

response.choices[0].message.content

'Orange who?'

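Even tasks that are not conversation-based can fit into the chat format, by placing the instruction in the first user message.

For example, to ask the model to explain asynchronous programming in the style of the pirate Blackbeard, we can structure the conversation as follows:

# Example with a system message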
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain asynchronous programming in the style of the pirate Blackbeard."},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

Arr, me matey! Let me tell ye a tale of asynchronous programming, in the style of the fearsome pirate Blackbeard!

Picture this, me hearties. In the vast ocean of programming, there be times when ye need to perform multiple tasks at once. But fear not, for asynchronous programming be here to save the day!

Ye see, in traditional programming, ye be waitin' for one task to be done before movin' on to the next. But with asynchronous programming, ye can be takin' care of multiple tasks at the same time, just like a pirate multitaskin' on the high seas!

Instead of waitin' for a task to be completed, ye can be sendin' it off on its own journey, while ye move on to the next task. It be like havin' a crew of trusty sailors, each takin' care of their own duties, without waitin' for the others.

Now, ye may be wonderin', how does this sorcery work? Well, me matey, it be all about callbacks and promises. When ye be sendin' off a task, ye be attachin' a callback function to it. This be like leavin' a message in a bottle, tellin' the task what to do when it be finished.

While the task be sailin' on its own, ye can be movin' on to the next task, without wastin' any precious time. And when the first task be done, it be sendin' a signal back to ye, lettin' ye know it be finished. Then ye can be takin' care of the callback function, like openin' the bottle and readin' the message inside.

But wait, there be more! With promises, ye can be makin' even fancier arrangements. Instead of callbacks, ye be makin' a promise that the task will be completed. It be like a contract between ye and the task, swearin' that it will be done.

Ye can be attachin' multiple promises to a task, promisin' different outcomes. And when the task be finished, it be fulfillin' the promises, lettin' ye know it be done. Then ye can be handlin' the fulfillments, like collectin' the rewards of yer pirate adventures!

So, me hearties, that be the tale of asynchronous programming, told in the style of the fearsome pirate Blackbeard! With callbacks and promises, ye can be takin' care of multiple tasks at once, just like a pirate conquerin' the seven seas!

# Example without a system message
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Explain asynchronous programming in the style of the pirate Blackbeard."},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

Arr, me hearties! Gather 'round and listen up, for I be tellin' ye about the mysterious art of asynchronous programming, in the style of the fearsome pirate Blackbeard!

Now, ye see, in the world of programming, there be times when we need to perform tasks that take a mighty long time to complete. These tasks might involve fetchin' data from the depths of the internet, or performin' complex calculations that would make even Davy Jones scratch his head.

In the olden days, we pirates used to wait patiently for each task to finish afore movin' on to the next one. But that be a waste of precious time, me hearties! We be pirates, always lookin' for ways to be more efficient and plunder more booty!

That be where asynchronous programming comes in, me mateys. It be a way to tackle multiple tasks at once, without waitin' for each one to finish afore movin' on. It be like havin' a crew of scallywags workin' on different tasks simultaneously, while ye be overseein' the whole operation.

Ye see, in asynchronous programming, we be breakin' down our tasks into smaller chunks called "coroutines." Each coroutine be like a separate pirate, workin' on its own task. When a coroutine be startin' its work, it don't wait for the task to finish afore movin' on to the next one. Instead, it be movin' on to the next task, lettin' the first one continue in the background.

Now, ye might be wonderin', "But Blackbeard, how be we know when a task be finished if we don't wait for it?" Ah, me hearties, that be where the magic of callbacks and promises come in!

When a coroutine be startin' its work, it be attachin' a callback or a promise to it. This be like leavin' a message in a bottle, tellin' the coroutine what to do when it be finished. So, while the coroutine be workin' away, the rest of the crew be movin' on to other tasks, plunderin' more booty along the way.

When a coroutine be finished with its task, it be sendin' a signal to the callback or fulfillin' the promise, lettin' the rest of the crew know that it be done. Then, the crew can gather 'round and handle the results of the completed task, celebratin' their victory and countin' their plunder.

So, me hearties, asynchronous programming be like havin' a crew of pirates workin' on different tasks at once, without waitin' for each one to finish afore movin' on. It be a way to be more efficient, plunder more booty, and conquer the vast seas of programming!

Now, set sail, me mateys, and embrace the power of asynchronous programming like true pirates of the digital realm! Arr!

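3. Tips for instructing gpt-3.5-turbo-0301

Best practices for instructing models may change from one model version to the next. The advice that follows applies to gpt-3.5-turbo-0301 and may not apply to future models.

System messages

The system message can be used to prime the assistant with different personalities or behaviors.

Be aware that gpt-3.5-turbo-0301 does not generally pay as much attention to the system message as gpt-4-0314 or gpt-3.5-turbo-0613. For gpt-3.5-turbo-0301, we therefore recommend placing important instructions in the user message instead. Some developers have found success in continually moving the system message near the end of the conversation, to keep the model's attention from drifting as conversations get longer.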
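One possible reading of that last tip is sketched below, assuming that "near the end" means placing the system message after every other message before each request. The helper name move_system_message_to_end is hypothetical, and it treats every system-role message as an instruction. Whether this actually helps depends on the model version.

```python
def move_system_message_to_end(messages):
    """Return a new list with system messages placed after all other messages."""
    system_messages = [m for m in messages if m["role"] == "system"]
    other_messages = [m for m in messages if m["role"] != "system"]
    return other_messages + system_messages


conversation = [
    {"role": "system", "content": "You are a laconic assistant."},
    {"role": "user", "content": "Knock knock."},
    {"role": "assistant", "content": "Who's there?"},
    {"role": "user", "content": "Orange."},
]

reordered = move_system_message_to_end(conversation)
print([m["role"] for m in reordered])  # ['user', 'assistant', 'user', 'system']
```

The original list is left untouched, so the stored history keeps its natural order and only the copy sent with the request is rearranged.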

# An example of a system message that primes the assistant to explain concepts in great depth
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a friendly and helpful teaching assistant. You explain concepts in great depth using simple terms, and you give examples to help people learn. At the end of each explanation, you ask a question to check for understanding"},
        {"role": "user", "content": "Can you explain how fractions work?"},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

Of course! Fractions are a way to represent parts of a whole. They are made up of two numbers: a numerator and a denominator. The numerator tells you how many parts you have, and the denominator tells you how many equal parts make up the whole.

Let's take an example to understand this better. Imagine you have a pizza that is divided into 8 equal slices. If you eat 3 slices, you can represent that as the fraction 3/8. Here, the numerator is 3 because you ate 3 slices, and the denominator is 8 because the whole pizza is divided into 8 slices.

Fractions can also be used to represent numbers less than 1. For example, if you eat half of a pizza, you can write it as 1/2. Here, the numerator is 1 because you ate one slice, and the denominator is 2 because the whole pizza is divided into 2 equal parts.

Now, let's talk about equivalent fractions. Equivalent fractions are different fractions that represent the same amount. For example, 1/2 and 2/4 are equivalent fractions because they both represent half of something. To find equivalent fractions, you can multiply or divide both the numerator and denominator by the same number.

Here's a question to check your understanding: If you have a cake divided into 12 equal slices and you eat 4 slices, what fraction of the cake did you eat?

# An example of a system message that primes the assistant to give brief, to-the-point answers
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a laconic assistant. You reply with brief, to-the-point answers with no elaboration."},
        {"role": "user", "content": "Can you explain how fractions work?"},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

Fractions represent parts of a whole. They have a numerator (top number) and a denominator (bottom number).

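Few-shot prompting

In some cases, it is easier to show the model what you want than to tell it.

One way to show the model what you want is with faked example messages.

For example:

# An example of a faked few-shot conversation that primes the model to translate business jargon into simpler speech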
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful, pattern-following assistant."},
        {"role": "user", "content": "Help me translate the following corporate jargon into plain English."},
        {"role": "assistant", "content": "Sure, I'd be happy to!"},
        {"role": "user", "content": "New synergies will help drive top-line growth."},
        {"role": "assistant", "content": "Things working well together will increase revenue."},
        {"role": "user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
        {"role": "assistant", "content": "Let's talk later when we're less busy about how to do better."},
        {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

This sudden change in direction means we don't have enough time to complete the entire project for the client.

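To help clarify that the example messages are not part of a real conversation, and should not be referred back to by the model, you can try setting the name field of system messages to example_user and example_assistant.

Transforming the few-shot example above, we could write:

# The business jargon translation example, but with example names for the example messages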
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful, pattern-following assistant that translates corporate jargon into plain English."},
        {"role": "system", "name":"example_user", "content": "New synergies will help drive top-line growth."},
        {"role": "system", "name": "example_assistant", "content": "Things working well together will increase revenue."},
        {"role": "system", "name":"example_user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
        {"role": "system", "name": "example_assistant", "content": "Let's talk later when we're less busy about how to do better."},
        {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

This sudden change in direction means we don't have enough time to complete the entire project for the client.
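The named-example layout used above can also be generated from a list of input/output pairs. This is a minimal sketch; build_few_shot_messages is a hypothetical helper name, while the example_user and example_assistant names follow the convention described in this section.

```python
def build_few_shot_messages(instruction, examples, query):
    """Build a messages list with few-shot examples encoded as named system messages."""
    messages = [{"role": "system", "content": instruction}]
    for example_input, example_output in examples:
        messages.append({"role": "system", "name": "example_user", "content": example_input})
        messages.append({"role": "system", "name": "example_assistant", "content": example_output})
    # The real request goes last, as an ordinary user message.
    messages.append({"role": "user", "content": query})
    return messages


few_shot = build_few_shot_messages(
    "You are a helpful, pattern-following assistant that translates corporate jargon into plain English.",
    [
        ("New synergies will help drive top-line growth.",
         "Things working well together will increase revenue."),
        ("Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage.",
         "Let's talk later when we're less busy about how to do better."),
    ],
    "This late pivot means we don't have time to boil the ocean for the client deliverable.",
)
print(len(few_shot))  # 6
```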

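Not every attempt at engineering conversations will succeed at first.

If your first attempts fail, don't be afraid to experiment with different ways of priming or conditioning the model.

As an example, one developer discovered an increase in accuracy when they inserted a user message that said "Great job so far, these have been perfect" to help condition the model into providing higher quality responses.

For more ideas on how to lift the reliability of the models, consider reading our guide on techniques to increase reliability. It was written for non-chat models, but many of its principles still apply.

4. Counting tokens

When you submit your request, the API transforms the messages into a sequence of tokens.

The number of tokens used affects:
- the cost of the request
- the time it takes to generate the response
- when the reply gets cut off for hitting the maximum token limit (4,096 for gpt-3.5-turbo or 8,192 for gpt-4)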
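As a rough illustration of the third point, the sketch below computes how much room is left for a reply once the prompt has been counted. It assumes that the limits quoted above cover the prompt and the reply together, and the names TOKEN_LIMITS and max_reply_tokens are hypothetical. Other model versions have different limits.

```python
# Limits quoted in the text above; other model versions differ.
TOKEN_LIMITS = {"gpt-3.5-turbo": 4096, "gpt-4": 8192}


def max_reply_tokens(prompt_tokens: int, model: str) -> int:
    """Return how many tokens remain for the reply, or 0 if the prompt alone fills the limit."""
    limit = TOKEN_LIMITS[model]
    return max(limit - prompt_tokens, 0)


print(max_reply_tokens(129, "gpt-3.5-turbo"))  # 3967
print(max_reply_tokens(9000, "gpt-4"))  # 0
```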

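You can use the following function to count the number of tokens that a list of messages will use.

Note that the exact way tokens are counted from messages may change from model to model. Consider the counts from the function below an estimate, not a timeless guarantee.

In particular, requests that use the optional functions input will consume extra tokens on top of the estimates calculated below.

Read more about counting tokens in How to count tokens with tiktoken.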

import tiktoken


def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0613"):
    """Return the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print("Warning: model not found. Using cl100k_base encoding.")
        encoding = tiktoken.get_encoding("cl100k_base")
    if model in {
        "gpt-3.5-turbo-0613",
        "gpt-3.5-turbo-16k-0613",
        "gpt-4-0314",
        "gpt-4-32k-0314",
        "gpt-4-0613",
        "gpt-4-32k-0613",
        }:
        tokens_per_message = 3
        tokens_per_name = 1
    elif model == "gpt-3.5-turbo-0301":
        tokens_per_message = 4  # every message follows <|start|>{role/name}\n{content}<|end|>\n
        tokens_per_name = -1  # if there's a name, the role is omitted
    elif "gpt-3.5-turbo" in model:
        print("Warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0613.")
        return num_tokens_from_messages(messages, model="gpt-3.5-turbo-0613")
    elif "gpt-4" in model:
        print("Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.")
        return num_tokens_from_messages(messages, model="gpt-4-0613")
    else:
        raise NotImplementedError(
            f"""num_tokens_from_messages() is not implemented for model {model}."""
        )
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens
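To see how the overheads in the function above add up, the sketch below repeats the same arithmetic with a stand-in tokenizer that counts one token per whitespace-separated word. This is an assumption made purely so the example runs without tiktoken. Real counts come from the model's encoding and will differ. The names count_with_overhead and toy_encode are hypothetical.

```python
def count_with_overhead(messages, encode, tokens_per_message=3, tokens_per_name=1):
    """Apply the per-message, per-name, and reply-priming overheads using any encode function."""
    total = 0
    for message in messages:
        total += tokens_per_message
        for key, value in message.items():
            total += len(encode(value))
            if key == "name":
                total += tokens_per_name
    return total + 3  # reply priming, as in the function above


toy_encode = str.split  # stand-in tokenizer, NOT the real encoding

toy_messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # 3 + 1 + 5 = 9
    {"role": "user", "name": "Alice", "content": "Knock knock."},  # 3 + 1 + (1 + 1) + 2 = 8
]
print(count_with_overhead(toy_messages, toy_encode))  # 9 + 8 + 3 = 20
```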

# let's verify the function above matches the OpenAI API response
example_messages = [
    {
        "role": "system",
        "content": "You are a helpful, pattern-following assistant that translates corporate jargon into plain English.",
    },
    {
        "role": "system",
        "name": "example_user",
        "content": "New synergies will help drive top-line growth.",
    },
    {
        "role": "system",
        "name": "example_assistant",
        "content": "Things working well together will increase revenue.",
    },
    {
        "role": "system",
        "name": "example_user",
        "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage.",
    },
    {
        "role": "system",
        "name": "example_assistant",
        "content": "Let's talk later when we're less busy about how to do better.",
    },
    {
        "role": "user",
        "content": "This late pivot means we don't have time to boil the ocean for the client deliverable.",
    },
]

for model in [
    # "gpt-3.5-turbo-0301",
    # "gpt-4-0314",
    # "gpt-4-0613",
    "gpt-3.5-turbo-1106",
    "gpt-3.5-turbo",
    "gpt-4",
    "gpt-4-1106-preview",
    ]:
    print(model)
    # example token count from the function defined above
    print(f"{num_tokens_from_messages(example_messages, model)} prompt tokens counted by num_tokens_from_messages().")
    # example token count from the OpenAI API
    response = client.chat.completions.create(
        model=model,
        messages=example_messages,
        temperature=0,
        max_tokens=1,
    )
    token = response.usage.prompt_tokens
    print(f'{token} prompt tokens counted by the OpenAI API.')
    print()

gpt-3.5-turbo-1106
Warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0613.
129 prompt tokens counted by num_tokens_from_messages().
129 prompt tokens counted by the OpenAI API.

gpt-3.5-turbo
Warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0613.
129 prompt tokens counted by num_tokens_from_messages().
129 prompt tokens counted by the OpenAI API.

gpt-4
Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.
129 prompt tokens counted by num_tokens_from_messages().
129 prompt tokens counted by the OpenAI API.

gpt-4-1106-preview
Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.
129 prompt tokens counted by num_tokens_from_messages().
129 prompt tokens counted by the OpenAI API.
