How to fine-tune chat models

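This notebook provides a step-by-step guide to our new gpt-3.5-turbo fine-tuning. We'll perform entity extraction using the RecipeNLG dataset, which provides various recipes along with a list of generic ingredients extracted from each one. It is a common dataset for named entity recognition (NER) tasks.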

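We will go through the following steps:

Setup: Load our dataset and filter it down to one domain to fine-tune on.
Data preparation: Prepare the data for fine-tuning by creating training and validation examples, and upload them to the Files endpoint.
Fine-tuning: Create your fine-tuned model.
Inference: Use your fine-tuned model for inference on new inputs.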

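By the end of this notebook you should be able to train, evaluate and deploy a fine-tuned gpt-3.5-turbo model.

For more information on fine-tuning, you can refer to our documentation guide, API reference or blog post.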

Setup

# Make sure to use the latest version of the openai Python package.
!pip install --upgrade openai 

import json
import openai
import os
import pandas as pd
from pprint import pprint

client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as env var>"))

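Fine-tuning works best when focused on a particular domain. It's important to make sure your dataset is focused enough for the model to learn, yet general enough that unseen examples won't be missed. With this in mind, we have extracted a subset of the RecipeNLG dataset that contains only documents from www.cookbooks.com.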
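The CSV loaded in this notebook is already filtered, so the filtering step itself is not shown. A minimal sketch of what it could look like, using only the standard library, follows. The `source` column name is taken from the table printed in this section; the sample rows, the `www.example.com` source, and the `filter_by_source` helper are hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical in-memory sample standing in for the full, unfiltered CSV.
SAMPLE_CSV = """title,source
No-Bake Nut Cookies,www.cookbooks.com
Some Other Recipe,www.example.com
Creamy Corn,www.cookbooks.com
"""

def filter_by_source(csv_text, source):
    """Return only the rows whose `source` column equals `source`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["source"] == source]

cookbook_rows = filter_by_source(SAMPLE_CSV, "www.cookbooks.com")
print([row["title"] for row in cookbook_rows])
```

With pandas, the same filter would typically be a boolean mask on the `source` column.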

# Read in the dataset we'll use for this task.
# This is the RecipeNLG dataset, cleaned to contain only documents from www.cookbooks.com.
recipe_df = pd.read_csv("data/cookbook_recipes_nlg_10k.csv")

recipe_df.head()

	title	ingredients	directions	link	source	NER
0	No-Bake Nut Cookies	["1 c. firmly packed brown sugar", "1/2 c. eva...	["In a heavy 2-quart saucepan, mix brown sugar...	www.cookbooks.com/Recipe-Details.aspx?id=44874	www.cookbooks.com	["brown sugar", "milk", "vanilla", "nuts", "bu...
1	Jewell Ball'S Chicken	["1 small jar chipped beef, cut up", "4 boned ...	["Place chipped beef on bottom of baking dish....	www.cookbooks.com/Recipe-Details.aspx?id=699419	www.cookbooks.com	["beef", "chicken breasts", "cream of mushroom...
2	Creamy Corn	["2 (16 oz.) pkg. frozen corn", "1 (8 oz.) pkg...	["In a slow cooker, combine all ingredients. C...	www.cookbooks.com/Recipe-Details.aspx?id=10570	www.cookbooks.com	["frozen corn", "cream cheese", "butter", "gar...
3	Chicken Funny	["1 large whole chicken", "2 (10 1/2 oz.) cans...	["Boil and debone chicken.", "Put bite size pi...	www.cookbooks.com/Recipe-Details.aspx?id=897570	www.cookbooks.com	["chicken", "chicken gravy", "cream of mushroo...
4	Reeses Cups(Candy)	["1 c. peanut butter", "3/4 c. graham cracker ...	["Combine first four ingredients and press in ...	www.cookbooks.com/Recipe-Details.aspx?id=659239	www.cookbooks.com	["peanut butter", "graham cracker crumbs", "bu...

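Data preparation

We'll begin by preparing our data. When fine-tuning with the ChatCompletion format, each training example is a simple list of messages. For example, an entry could look like: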

[{'role': 'system',
  'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'},

 {'role': 'user',
  'content': 'Title: No-Bake Nut Cookies\n\nIngredients: ["1 c. firmly packed brown sugar", "1/2 c. evaporated milk", "1/2 tsp. vanilla", "1/2 c. broken nuts (pecans)", "2 Tbsp. butter or margarine", "3 1/2 c. bite size shredded rice biscuits"]\n\nGeneric ingredients: '},

 {'role': 'assistant',
  'content': '["brown sugar", "milk", "vanilla", "nuts", "butter", "bite size shredded rice biscuits"]'}]

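During training, this conversation is split: the final entry is the completion the model is trained to produce, and the remaining messages act as the prompt. Keep this in mind when building your training examples. If your model will act on multi-turn conversations, provide representative examples so it doesn't perform poorly when the conversation starts to expand.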
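To make the split concrete, here is a small sketch of how one conversation divides into prompt and completion. It only illustrates the idea and is not how the fine-tuning service is implemented; `split_example` is a hypothetical helper and the example conversation is shortened.

```python
def split_example(messages):
    """Split a conversation into (prompt_messages, completion_message)."""
    if not messages or messages[-1]["role"] != "assistant":
        raise ValueError("The last message must come from the assistant.")
    return messages[:-1], messages[-1]

# Shortened example conversation in the same shape as the entry shown earlier.
example = [
    {"role": "system", "content": "You are a helpful recipe assistant."},
    {"role": "user", "content": "Title: Creamy Corn\n\nGeneric ingredients: "},
    {"role": "assistant", "content": '["frozen corn", "cream cheese"]'},
]

prompt, completion = split_example(example)
print(len(prompt))             # number of messages that form the prompt
print(completion["content"])   # text the model is trained to generate
```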

Please note that there is currently a 4096-token limit for each training example. Anything longer than this will be truncated at 4096 tokens.

training_data = []

system_message = "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."

def create_user_message(row):
    return f"""Title: {row['title']}

Ingredients: {row['ingredients']}

Generic ingredients: """

def prepare_example_conversation(row):
    messages = []
    messages.append({"role": "system", "content": system_message})

    user_message = create_user_message(row)
    messages.append({"role": "user", "content": user_message})

    messages.append({"role": "assistant", "content": row["NER"]})

    return {"messages": messages}

pprint(prepare_example_conversation(recipe_df.iloc[0]))

{'messages': [{'content': 'You are a helpful recipe assistant. You are to '
                          'extract the generic ingredients from each of the '
                          'recipes provided.',
               'role': 'system'},
              {'content': 'Title: No-Bake Nut Cookies\n'
                          '\n'
                          'Ingredients: ["1 c. firmly packed brown sugar", '
                          '"1/2 c. evaporated milk", "1/2 tsp. vanilla", "1/2 '
                          'c. broken nuts (pecans)", "2 Tbsp. butter or '
                          'margarine", "3 1/2 c. bite size shredded rice '
                          'biscuits"]\n'
                          '\n'
                          'Generic ingredients: ',
               'role': 'user'},
              {'content': '["brown sugar", "milk", "vanilla", "nuts", '
                          '"butter", "bite size shredded rice biscuits"]',
               'role': 'assistant'}]}

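Now let's do this for a subset of the dataset to use as our training data. You can begin with as few as 30-50 well-pruned examples. You should see performance continue to scale linearly as you increase the size of the training set, but your jobs will also take longer.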

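# Use rows 0 through 100 of the dataset for training.
# Note that .loc slicing includes both endpoints, so this selects 101 rows.
training_df = recipe_df.loc[0:100]

# Apply the prepare_example_conversation function to each row of training_df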
training_data = training_df.apply(prepare_example_conversation, axis=1).tolist()

for example in training_data[:5]:
    print(example)

{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: No-Bake Nut Cookies\n\nIngredients: ["1 c. firmly packed brown sugar", "1/2 c. evaporated milk", "1/2 tsp. vanilla", "1/2 c. broken nuts (pecans)", "2 Tbsp. butter or margarine", "3 1/2 c. bite size shredded rice biscuits"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["brown sugar", "milk", "vanilla", "nuts", "butter", "bite size shredded rice biscuits"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Jewell Ball\'S Chicken\n\nIngredients: ["1 small jar chipped beef, cut up", "4 boned chicken breasts", "1 can cream of mushroom soup", "1 carton sour cream"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["beef", "chicken breasts", "cream of mushroom soup", "sour cream"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Creamy Corn\n\nIngredients: ["2 (16 oz.) pkg. frozen corn", "1 (8 oz.) pkg. cream cheese, cubed", "1/3 c. butter, cubed", "1/2 tsp. garlic powder", "1/2 tsp. salt", "1/4 tsp. pepper"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["frozen corn", "cream cheese", "butter", "garlic powder", "salt", "pepper"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Chicken Funny\n\nIngredients: ["1 large whole chicken", "2 (10 1/2 oz.) cans chicken gravy", "1 (10 1/2 oz.) can cream of mushroom soup", "1 (6 oz.) box Stove Top stuffing", "4 oz. shredded cheese"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["chicken", "chicken gravy", "cream of mushroom soup", "shredded cheese"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Reeses Cups(Candy)  \n\nIngredients: ["1 c. peanut butter", "3/4 c. graham cracker crumbs", "1 c. melted butter", "1 lb. (3 1/2 c.) powdered sugar", "1 large pkg. chocolate chips"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["peanut butter", "graham cracker crumbs", "butter", "powdered sugar", "chocolate chips"]'}]}

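In addition to training data, we can optionally provide validation data, which is used to make sure that the model does not overfit the training set.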

validation_df = recipe_df.loc[101:200]
validation_data = validation_df.apply(prepare_example_conversation, axis=1).tolist()

We then need to save our data as .jsonl files, with each line being one training example conversation.

def write_jsonl(data_list: list, filename: str) -> None:
    with open(filename, "w") as out:
        for ddict in data_list:
            jout = json.dumps(ddict) + "\n"
            out.write(jout)

training_file_name = "tmp_recipe_finetune_training.jsonl"
write_jsonl(training_data, training_file_name)

validation_file_name = "tmp_recipe_finetune_validation.jsonl"
write_jsonl(validation_data, validation_file_name)
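Before uploading, it can help to read a file back and confirm that every line parses and has the expected shape. The sketch below is one possible sanity check, not an official validator. `check_jsonl` and `ALLOWED_ROLES` are hypothetical names, and the role set covers only the roles used in this notebook.

```python
import json
import os
import tempfile

# Assumption: only the roles used in this notebook are expected.
ALLOWED_ROLES = {"system", "user", "assistant"}

def check_jsonl(path):
    """Return (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append((line_number, "not valid JSON"))
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append((line_number, "missing or empty 'messages' list"))
                continue
            well_formed = all(
                isinstance(m, dict)
                and m.get("role") in ALLOWED_ROLES
                and isinstance(m.get("content"), str)
                for m in messages
            )
            if not well_formed:
                problems.append((line_number, "message with unexpected role or non-string content"))
            elif messages[-1]["role"] != "assistant":
                problems.append((line_number, "last message is not from the assistant"))
    return problems

# Demo on a throwaway file: one good line, one broken line, one without an assistant reply.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.jsonl")
    with open(demo_path, "w", encoding="utf-8") as out:
        good = {"messages": [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]}
        no_reply = {"messages": [{"role": "user", "content": "hi"}]}
        out.write(json.dumps(good) + "\n")
        out.write("{not json}\n")
        out.write(json.dumps(no_reply) + "\n")
    demo_problems = check_jsonl(demo_path)

print(demo_problems)
```

In this notebook, the same check could be run as `check_jsonl(training_file_name)` and `check_jsonl(validation_file_name)`.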

Here are the first 5 lines of our training .jsonl file:

# Print the first 5 lines of the training file
!head -n 5 tmp_recipe_finetune_training.jsonl

{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: No-Bake Nut Cookies\n\nIngredients: [\"1 c. firmly packed brown sugar\", \"1/2 c. evaporated milk\", \"1/2 tsp. vanilla\", \"1/2 c. broken nuts (pecans)\", \"2 Tbsp. butter or margarine\", \"3 1/2 c. bite size shredded rice biscuits\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"brown sugar\", \"milk\", \"vanilla\", \"nuts\", \"butter\", \"bite size shredded rice biscuits\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Jewell Ball'S Chicken\n\nIngredients: [\"1 small jar chipped beef, cut up\", \"4 boned chicken breasts\", \"1 can cream of mushroom soup\", \"1 carton sour cream\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"beef\", \"chicken breasts\", \"cream of mushroom soup\", \"sour cream\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Creamy Corn\n\nIngredients: [\"2 (16 oz.) pkg. frozen corn\", \"1 (8 oz.) pkg. cream cheese, cubed\", \"1/3 c. butter, cubed\", \"1/2 tsp. garlic powder\", \"1/2 tsp. salt\", \"1/4 tsp. pepper\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"frozen corn\", \"cream cheese\", \"butter\", \"garlic powder\", \"salt\", \"pepper\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Chicken Funny\n\nIngredients: [\"1 large whole chicken\", \"2 (10 1/2 oz.) cans chicken gravy\", \"1 (10 1/2 oz.) can cream of mushroom soup\", \"1 (6 oz.) box Stove Top stuffing\", \"4 oz. shredded cheese\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"chicken\", \"chicken gravy\", \"cream of mushroom soup\", \"shredded cheese\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Reeses Cups(Candy)  \n\nIngredients: [\"1 c. peanut butter\", \"3/4 c. graham cracker crumbs\", \"1 c. melted butter\", \"1 lb. (3 1/2 c.) powdered sugar\", \"1 large pkg. chocolate chips\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"peanut butter\", \"graham cracker crumbs\", \"butter\", \"powdered sugar\", \"chocolate chips\"]"}]}

Upload files

You can now upload the files to our Files endpoint to be used by the fine-tuned model.

with open(training_file_name, "rb") as training_fd:
    training_response = client.files.create(
        file=training_fd, purpose="fine-tune"
    )

training_file_id = training_response.id

with open(validation_file_name, "rb") as validation_fd:
    validation_response = client.files.create(
        file=validation_fd, purpose="fine-tune"
    )
validation_file_id = validation_response.id

print("Training file ID:", training_file_id)
print("Validation file ID:", validation_file_id)

Training file ID: file-PVkEstNM2WWd1OQe3Hp3tC5E
Validation file ID: file-WSdTwLYrKxNhKi1WWGjxXi87

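Fine-tuning

Now we can create our fine-tuning job with the generated files and an optional suffix to identify the model. The response will contain an id which you can use to retrieve updates on the job.

Note: The files have to be processed by our system first, so you might get a File not ready error. In that case, simply retry a few minutes later.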

response = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    validation_file=validation_file_id,
    model="gpt-3.5-turbo",
    suffix="recipe-ner",
)

job_id = response.id

print("Job ID:", response.id)
print("Status:", response.status)

Job ID: ftjob-bIVrnhnZEEizSP7rqWsRwv2R
Status: validating_files
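If job creation fails with the File not ready error mentioned in the note, the retry can be automated. The following is a minimal sketch under stated assumptions: `retry` is a hypothetical helper, it treats every exception as retryable (in practice you would narrow this to the specific error you observe), and the demo uses a stand-in function instead of a real API call.

```python
import time

def retry(action, attempts=5, wait_seconds=60.0, sleep=time.sleep):
    """Call `action()` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # Assumption: any exception is treated as retryable.
            last_error = error
            if attempt < attempts:
                sleep(wait_seconds)
    raise last_error

# Demo with a stand-in that fails twice before succeeding (no API call is made).
calls = {"count": 0}

def flaky_create():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("File not ready")
    return "ftjob-demo"

result = retry(flaky_create, attempts=5, wait_seconds=0, sleep=lambda seconds: None)
print(result, calls["count"])
```

In this notebook, the action would be a lambda wrapping the `client.fine_tuning.jobs.create(...)` call shown above.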

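Check job status

You can make a GET request to the https://api.openai.com/v1/fine_tuning/jobs endpoint to list your fine-tuning jobs. In this case, you'll want to check that the ID you got from the previous step ends up with status: succeeded.

Once it is completed, you can use the result_files to sample the results from the validation set (if you uploaded one), and use the ID from the fine_tuned_model parameter to invoke your trained model.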

response = client.fine_tuning.jobs.retrieve(job_id)

print("Job ID:", response.id)
print("Status:", response.status)
print("Trained Tokens:", response.trained_tokens)

Job ID: ftjob-bIVrnhnZEEizSP7rqWsRwv2R
Status: running
Trained Tokens: None
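Instead of checking the status by hand, you could poll until the job reaches a final state. This is a sketch, not part of the original notebook: `wait_for_job` is a hypothetical helper, the set of terminal statuses is an assumption about the values the API reports, and the demo uses a fake `retrieve` function so it runs without network access.

```python
import time
from types import SimpleNamespace

# Assumption: these are the statuses after which a job no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_job(retrieve, job_id, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Call `retrieve(job_id)` repeatedly until the job reports a terminal status."""
    for _ in range(max_polls):
        job = retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls.")

# Demo with a fake retrieve function that walks through three statuses.
fake_statuses = iter(["validating_files", "running", "succeeded"])

def fake_retrieve(job_id):
    return SimpleNamespace(id=job_id, status=next(fake_statuses))

finished = wait_for_job(fake_retrieve, "ftjob-demo", poll_seconds=0, sleep=lambda seconds: None)
print(finished.status)
```

With the real client, the call would look like `wait_for_job(client.fine_tuning.jobs.retrieve, job_id)`.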

We can track the progress of the fine-tune with the events endpoint. You can rerun the cell below a few times until the fine-tune is ready.

response = client.fine_tuning.jobs.list_events(job_id)

events = response.data
events.reverse()

for event in events:
    print(event.message)

Step 131/303: training loss=0.25, validation loss=0.37
Step 141/303: training loss=0.00, validation loss=0.19
Step 151/303: training loss=0.00, validation loss=0.11
Step 161/303: training loss=0.00, validation loss=0.06
Step 171/303: training loss=0.10, validation loss=0.00
Step 181/303: training loss=0.00, validation loss=0.38
Step 191/303: training loss=0.00, validation loss=0.15
Step 201/303: training loss=0.06, validation loss=0.64
Step 211/303: training loss=0.00, validation loss=0.04
Step 221/303: training loss=0.59, validation loss=0.85
Step 231/303: training loss=0.00, validation loss=0.00
Step 241/303: training loss=0.04, validation loss=0.42
Step 251/303: training loss=0.00, validation loss=0.14
Step 261/303: training loss=0.00, validation loss=0.00
Step 271/303: training loss=0.15, validation loss=0.50
Step 281/303: training loss=0.00, validation loss=0.72
Step 291/303: training loss=0.08, validation loss=0.16
Step 301/303: training loss=0.00, validation loss=1.76
New fine-tuned model created: ft:gpt-3.5-turbo-0613:personal:recipe-ner:8PjmcwDH
The job has successfully completed
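The step messages above can be turned into numbers, for example to plot the loss curves or to spot a validation loss that drifts upward. The sketch below assumes the message format seen in this output, which is not a documented or stable format; `parse_step_messages` is a hypothetical helper.

```python
import re

# Assumption: step messages look exactly like the lines printed above.
STEP_PATTERN = re.compile(
    r"Step (\d+)/(\d+): training loss=([\d.]+), validation loss=([\d.]+)"
)

def parse_step_messages(messages):
    """Return (step, total_steps, training_loss, validation_loss) for each step message."""
    rows = []
    for message in messages:
        match = STEP_PATTERN.fullmatch(message.strip())
        if match:
            step, total, train_loss, val_loss = match.groups()
            rows.append((int(step), int(total), float(train_loss), float(val_loss)))
    return rows

# Sample messages copied from the output above.
sample_messages = [
    "Step 291/303: training loss=0.08, validation loss=0.16",
    "Step 301/303: training loss=0.00, validation loss=1.76",
    "The job has successfully completed",
]

parsed = parse_step_messages(sample_messages)
print(parsed)
```

In this notebook, the input would be `[event.message for event in events]`.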

Now that the job is done, we can get the fine-tuned model ID from it:

response = client.fine_tuning.jobs.retrieve(job_id)
fine_tuned_model_id = response.fine_tuned_model

if fine_tuned_model_id is None: 
    raise RuntimeError("Fine-tuned model ID not found. Your job has likely not been completed yet.")

print("Fine-tuned model ID:", fine_tuned_model_id)

Fine-tuned model ID: ft:gpt-3.5-turbo-0613:personal:recipe-ner:8PjmcwDH

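Inference

The last step is to use your fine-tuned model for inference. Similar to the classic FineTuning, you simply call ChatCompletions with your new fine-tuned model name as the model parameter.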

test_df = recipe_df.loc[201:300]
test_row = test_df.iloc[0]
test_messages = []
test_messages.append({"role": "system", "content": system_message})
user_message = create_user_message(test_row)
test_messages.append({"role": "user", "content": user_message})

pprint(test_messages)

[{'content': 'You are a helpful recipe assistant. You are to extract the '
             'generic ingredients from each of the recipes provided.',
  'role': 'system'},
 {'content': 'Title: Beef Brisket\n'
             '\n'
             'Ingredients: ["4 lb. beef brisket", "1 c. catsup", "1 c. water", '
             '"1/2 onion, minced", "2 Tbsp. cider vinegar", "1 Tbsp. prepared '
             'horseradish", "1 Tbsp. prepared mustard", "1 tsp. salt", "1/2 '
             'tsp. pepper"]\n'
             '\n'
             'Generic ingredients: ',
  'role': 'user'}]

response = client.chat.completions.create(
    model=fine_tuned_model_id, messages=test_messages, temperature=0, max_tokens=500
)
print(response.choices[0].message.content)

["beef brisket", "catsup", "water", "onion", "cider vinegar", "horseradish", "mustard", "salt", "pepper"]
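To go from a single spot check to a rough evaluation, each prediction can be compared with the `NER` column of the same row. The sketch below computes set-based precision and recall for one example. `score_prediction` is a hypothetical helper, the reference is the first recipe's label from earlier in this notebook, and the prediction is made up for illustration rather than being real model output.

```python
import json

def score_prediction(predicted_json, reference_json):
    """Return (precision, recall) over the sets of ingredient strings."""
    predicted = set(json.loads(predicted_json))
    reference = set(json.loads(reference_json))
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Reference label for "No-Bake Nut Cookies", as shown in the data preparation section.
reference = '["brown sugar", "milk", "vanilla", "nuts", "butter", "bite size shredded rice biscuits"]'
# Hypothetical prediction that gets five of the six ingredients exactly right.
hypothetical_prediction = '["brown sugar", "milk", "vanilla", "nuts", "butter", "rice biscuits"]'

precision, recall = score_prediction(hypothetical_prediction, reference)
print(round(precision, 3), round(recall, 3))
```

Exact string matching is strict, and a model response is not guaranteed to be valid JSON, so a fuller evaluation would likely handle parse failures and near matches.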
