Fine-tuning for function calling

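This notebook covers how to fine-tune a model to increase the accuracy and reliability of function calling. You can find more information on function calling here, and on fine-tuning here.

For context, from the function-calling notebook above: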

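"tools" is an optional parameter in the Chat Completion API that can be used to provide function specifications. Its purpose is to enable the model to generate function arguments that adhere to the provided specifications. Note that the API will not actually execute any function calls. It is up to developers to execute function calls using the model outputs.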
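Because the API only returns a function name and a JSON string of arguments, the calling code has to parse and dispatch them itself. The following is a minimal sketch of that step. It assumes a hypothetical local handler `takeoff_drone` and a hand-written `example_call` dict that stands in for one entry of the model's tool calls; neither is produced by the API here.

```python
import json


# Hypothetical local implementation; a real one would talk to the drone.
def takeoff_drone(altitude: int) -> str:
    return f"taking off to {altitude} m"


# Map tool names (as declared in `tools`) to local callables.
HANDLERS = {"takeoff_drone": takeoff_drone}


def dispatch(tool_call: dict) -> str:
    """Run the local function named in a tool call.

    `tool_call` mimics the shape of one tool call: a function name
    plus its arguments encoded as a JSON string.
    """
    name = tool_call["function"]["name"]
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"model requested unknown function: {name}")
    arguments = json.loads(tool_call["function"]["arguments"])
    return handler(**arguments)


# Stand-in for model output (assumed, not returned by the API here).
example_call = {
    "type": "function",
    "function": {"name": "takeoff_drone", "arguments": '{"altitude": 50}'},
}
result = dispatch(example_call)
print(result)
```

Rejecting unknown names explicitly matters here, since a hallucinated function name would otherwise surface as an obscure lookup error.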

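Function calling is a very powerful tool when it works as intended. However, we have seen that as the number of functions grows and the task at hand becomes more complex, function calling becomes less accurate (for example, more hallucinated invocations and more incorrect invocations).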

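Before fine-tuning for function calling, it is best to start with the following:

  • Improve the function definitions. Make them clearer and more distinct from one another.
  • Experiment with prompt engineering: a more detailed prompt can often help the model call the correct function.

_If_ the steps above fail to improve function calling to a satisfactory level, then you can try fine-tuning for function calling.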

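Overview

This notebook contains three sections:

  • Assessing baseline function calling performance: evaluating an out-of-the-box gpt-3.5-turbo model on our given functions (let's assume that, for latency and cost reasons, we cannot use gpt-4o for a drone copilot)
  • Generating synthetic data: using gpt-4o to create a "golden" set of prompts and function invocations to use as training data
  • Fine-tuning: running the fine-tuning job and evaluating the fine-tuned model

Note: This notebook provides an example of how to create synthetic training data for fine-tuning function calling given only a list of functions. While evals based on real production traffic are preferable, this method can produce strong results and can be used in conjunction with real-world training data.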

Getting baseline function calling performance

#!pip install tenacity -q
#!pip install openai -q
#!pip install typing -q
!pip install python-dotenv

import numpy as np
import json
import os
from IPython.display import display
import pandas as pd
from openai import OpenAI
import itertools
import time
import base64
from tenacity import retry, wait_random_exponential, stop_after_attempt
from typing import Any, Dict, List, Generator
import ast

%load_ext dotenv
%dotenv

client = OpenAI(api_key=os.environ.get("OPENAI_BUILD_HOUR_KEY"))

The dotenv extension is already loaded. To reload it, use:
%reload_ext dotenv

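Utilities

Let's define two utility functions: one that calls the Chat Completions API and returns the completion, and one that evaluates which function the model calls for each prompt.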

def get_chat_completion(
    messages: list[dict[str, str]],
    model: str = "gpt-3.5-turbo",
    max_tokens=500,
    temperature=0.0,
    stop=None,
    tools=None,
    seed=42,
    functions=None,
    tool_choice=None,
) -> tuple[Any, Any]:
    """Call the Chat Completions API and return (message, usage)."""
    params = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stop": stop,
        "tools": tools,
        "seed": seed,
        "tool_choice": tool_choice,
    }
    # Only send the legacy "functions" field when it was explicitly provided
    if functions:
        params["functions"] = functions

    completion = client.chat.completions.create(**params)
    return completion.choices[0].message, completion.usage


# Note: this name shadows Python's built-in eval() for the rest of the notebook.
def eval(model: str, system_prompt: str, function_list, prompts_to_expected_tool_name):
    """
    Evaluate the performance of a model in selecting the correct function based on given prompts.

    Args:
        model (str): The name of the model to be evaluated.
        system_prompt (str): The system prompt to be used in the chat completion.
        function_list (list): A list of functions that the model can call.
        prompts_to_expected_tool_name (dict): A dictionary mapping prompts to their expected function names.

    Returns:
        None
    """

    prompts_to_actual = []
    latencies = []
    tokens_used = []

    for prompt, expected_function in prompts_to_expected_tool_name.items():
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ]

        start_time = time.time()
        completion, usage = get_chat_completion(
            model=model,
            messages=messages,
            seed=42,
            tools=function_list,
            temperature=0.0,
            tool_choice="required",
        )
        end_time = time.time()

        latency = (end_time - start_time) * 1000  # convert to milliseconds
        latencies.append(latency)

        prompts_to_actual.append(
            {prompt: completion.tool_calls[0].function.name})

        # Record the number of tokens used
        tokens_used.append(usage.total_tokens)

    total_prompts = len(prompts_to_expected_tool_name)

    # Count the prompts for which the called function matches the expected one
    matches = sum(
        1
        for result in prompts_to_actual
        if list(result.values())[0]
        == prompts_to_expected_tool_name[list(result.keys())[0]]
    )
    match_percentage = (matches / total_prompts) * 100

    # Calculate the average latency
    avg_latency = sum(latencies) / total_prompts
    # Calculate the average number of tokens used
    avg_tokens_used = sum(tokens_used) / total_prompts

    # Collect the per-prompt results into a DataFrame
    results_list = []
    for result in prompts_to_actual:
        prompt = list(result.keys())[0]
        actual_function = list(result.values())[0]
        expected_function = prompts_to_expected_tool_name[prompt]
        match = actual_function == expected_function
        results_list.append(
            {
                "Prompt": prompt,
                "Actual": actual_function,
                "Expected": expected_function,
                "Match": "Yes" if match else "No",
            }
        )
    results_df = pd.DataFrame(results_list)

    def style_rows(row):
        # Highlight mismatched rows in red
        match = row["Match"]
        background_color = "red" if match == "No" else "white"
        return ["background-color: {}; color: black".format(background_color)] * len(
            row
        )

    styled_results_df = results_df.style.apply(style_rows, axis=1)

    # Display the DataFrame as a table
    display(styled_results_df)

    print(
        f"Number of matches: {matches} out of {total_prompts} ({match_percentage:.2f}%)"
    )
    print(f"Average latency per request: {avg_latency:.2f} ms")
    print(f"Average tokens used per request: {avg_tokens_used:.2f}")

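Baseline testing

Let's build an intelligent drone copilot. We want to be able to give the copilot commands and have it either call the function for that command or reject the request if the command is infeasible. We can start by defining a system prompt for the copilot.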

DRONE_SYSTEM_PROMPT = """You are an intelligent AI that controls a drone. Given a command or request from the user,
call one of your functions to complete the request. If the request cannot be completed by your available functions, call the reject_request function.
If the request is ambiguous or unclear, reject the request."""

Now let's define functions for all of the actions the copilot can take.

function_list = [
{
"type": "function",
"function": {
"name": "takeoff_drone",
"description": "Initiate the drone's takeoff sequence.",
"parameters": {
"type": "object",
"properties": {
"altitude": {
"type": "integer",
"description": "Specifies the altitude in meters to which the drone should ascend.",
}
},
"required": ["altitude"],
},
},
},
{
"type": "function",
"function": {
"name": "land_drone",
"description": "Land the drone at its current location or a specified landing point.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"enum": ["current", "home_base", "custom"],
"description": "Specifies the landing location for the drone.",
},
"coordinates": {
"type": "object",
"description": "GPS coordinates for custom landing location. Required if location is 'custom'.",
},
},
"required": ["location"],
},
},
},
{
"type": "function",
"function": {
"name": "control_drone_movement",
"description": "Direct the drone's movement in a specific direction.",
"parameters": {
"type": "object",
"properties": {
"direction": {
"type": "string",
"enum": ["forward", "backward", "left", "right", "up", "down"],
"description": "Direction in which the drone should move.",
},
"distance": {
"type": "integer",
"description": "Distance in meters the drone should travel in the specified direction.",
},
},
"required": ["direction", "distance"],
},
},
},
{
"type": "function",
"function": {
"name": "set_drone_speed",
"description": "Adjust the speed of the drone.",
"parameters": {
"type": "object",
"properties": {
"speed": {
"type": "integer",
"description": "Specifies the speed in km/h. Valid range is 0 to 100.",
"minimum": 0,
}
},
"required": ["speed"],
},
},
},
{
"type": "function",
"function": {
"name": "control_camera",
"description": "Control the drone's camera to capture images or videos.",
"parameters": {
"type": "object",
"properties": {
"mode": {
"type": "string",
"enum": ["photo", "video", "panorama"],
"description": "Camera mode to capture content.",
},
"duration": {
"type": "integer",
"description": "Duration in seconds for video capture. Required if mode is 'video'.",
},
},
"required": ["mode"],
},
},
},
{
"type": "function",
"function": {
"name": "control_gimbal",
"description": "Adjust the drone's gimbal for camera stabilization and direction.",
"parameters": {
"type": "object",
"properties": {
"tilt": {
"type": "integer",
"description": "Tilt angle for the gimbal in degrees.",
},
"pan": {
"type": "integer",
"description": "Pan angle for the gimbal in degrees.",
},
},
"required": ["tilt", "pan"],
},
},
},
{
"type": "function",
"function": {
"name": "set_drone_lighting",
"description": "Control the drone's lighting for visibility and signaling.",
"parameters": {
"type": "object",
"properties": {
"mode": {
"type": "string",
"enum": ["on", "off", "blink", "sos"],
"description": "Lighting mode for the drone.",
}
},
"required": ["mode"],
},
},
},
{
"type": "function",
"function": {
"name": "return_to_home",
"description": "Command the drone to return to its home or launch location.",
"parameters": {"type": "object", "properties": {}},
},
},
{
"type": "function",
"function": {
"name": "set_battery_saver_mode",
"description": "Toggle battery saver mode.",
"parameters": {
"type": "object",
"properties": {
"status": {
"type": "string",
"enum": ["on", "off"],
"description": "Toggle battery saver mode.",
}
},
"required": ["status"],
},
},
},
{
"type": "function",
"function": {
"name": "set_obstacle_avoidance",
"description": "Configure obstacle avoidance settings.",
"parameters": {
"type": "object",
"properties": {
"mode": {
"type": "string",
"enum": ["on", "off"],
"description": "Toggle obstacle avoidance.",
}
},
"required": ["mode"],
},
},
},
{
"type": "function",
"function": {
"name": "set_follow_me_mode",
"description": "Enable or disable 'follow me' mode.",
"parameters": {
"type": "object",
"properties": {
"status": {
"type": "string",
"enum": ["on", "off"],
"description": "Toggle 'follow me' mode.",
}
},
"required": ["status"],
},
},
},
{
"type": "function",
"function": {
"name": "calibrate_sensors",
"description": "Initiate calibration sequence for drone's sensors.",
"parameters": {"type": "object", "properties": {}},
},
},
{
"type": "function",
"function": {
"name": "set_autopilot",
"description": "Enable or disable autopilot mode.",
"parameters": {
"type": "object",
"properties": {
"status": {
"type": "string",
"enum": ["on", "off"],
"description": "Toggle autopilot mode.",
}
},
"required": ["status"],
},
},
},
{
"type": "function",
"function": {
"name": "configure_led_display",
"description": "Configure the drone's LED display pattern and colors.",
"parameters": {
"type": "object",
"properties": {
"pattern": {
"type": "string",
"enum": ["solid", "blink", "pulse", "rainbow"],
"description": "Pattern for the LED display.",
},
"color": {
"type": "string",
"enum": ["red", "blue", "green", "yellow", "white"],
"description": "Color for the LED display. Not required if pattern is 'rainbow'.",
},
},
"required": ["pattern"],
},
},
},
{
"type": "function",
"function": {
"name": "set_home_location",
"description": "Set or change the home location for the drone.",
"parameters": {
"type": "object",
"properties": {
"coordinates": {
"type": "object",
"description": "GPS coordinates for the home location.",
}
},
"required": ["coordinates"],
},
},
},
{
"type": "function",
"function": {
"name": "reject_request",
"description": "Use this function if the request is not possible.",
"parameters": {"type": "object", "properties": {}},
},
},
]

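To start, let's see how function calling performs on some straightforward, feasible prompts, along with a couple of obviously impossible requests that should call the 'reject_request' function.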

straightforward_prompts_to_expected = {
"Land the drone at the home base": "land_drone",
"Take off the drone to 50 meters": "takeoff_drone",
"Change speed to 15 kilometers per hour": "set_drone_speed",
"Turn into an elephant!": "reject_request",
"Move the drone forward by 10 meters": "control_drone_movement",
"I want the LED display to blink in red": "configure_led_display",
"Can you take a photo?": "control_camera",
"Can you detect obstacles?": "set_obstacle_avoidance",
"Can you dance for me?": "reject_request",
"Can you follow me?": "set_follow_me_mode",
}

# Evaluate the model with the given prompts
eval(
model="gpt-3.5-turbo",
system_prompt=DRONE_SYSTEM_PROMPT,
function_list=function_list,
prompts_to_expected_tool_name=straightforward_prompts_to_expected,
)

|   | Prompt | Actual | Expected | Match |
|---|--------|--------|----------|-------|
| 0 | Land the drone at the home base | land_drone | land_drone | Yes |
| 1 | Take off the drone to 50 meters | takeoff_drone | takeoff_drone | Yes |
| 2 | Change speed to 15 kilometers per hour | set_drone_speed | set_drone_speed | Yes |
| 3 | Turn into an elephant! | reject_request | reject_request | Yes |
| 4 | Move the drone forward by 10 meters | control_drone_movement | control_drone_movement | Yes |
| 5 | I want the LED display to blink in red | configure_led_display | configure_led_display | Yes |
| 6 | Can you take a photo? | control_camera | control_camera | Yes |
| 7 | Can you detect obstacles? | set_obstacle_avoidance | set_obstacle_avoidance | Yes |
| 8 | Can you dance for me? | reject_request | reject_request | Yes |
| 9 | Can you follow me? | set_follow_me_mode | set_follow_me_mode | Yes |

Number of matches: 10 out of 10 (100.00%)
Average latency per request: 826.81 ms
Average tokens used per request: 796.20

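Great! The model performs quite well on these requests. Now let's try some harder ones: requests that are almost feasible and drone-related, but that the drone cannot actually carry out, so the copilot should reject them.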

challenging_prompts_to_expected = {
"Play pre-recorded audio message": "reject_request",
"Initiate following on social media": "reject_request",
"Scan environment for heat signatures": "reject_request",
"Bump into obstacles": "reject_request",
"Change drone's paint job color": "reject_request",
"Coordinate with nearby drones": "reject_request",
"Change speed to negative 120 km/h": "reject_request",
"Detect a person": "reject_request",
"Please enable night vision": "reject_request",
"Report on humidity levels around you": "reject_request",
}

# Evaluate the model with the challenging prompts
eval(
model="gpt-3.5-turbo",
function_list=function_list,
system_prompt=DRONE_SYSTEM_PROMPT,
prompts_to_expected_tool_name=challenging_prompts_to_expected,
)

|   | Prompt | Actual | Expected | Match |
|---|--------|--------|----------|-------|
| 0 | Play pre-recorded audio message | reject_request | reject_request | Yes |
| 1 | Initiate following on social media | set_follow_me_mode | reject_request | No |
| 2 | Scan environment for heat signatures | reject_request | reject_request | Yes |
| 3 | Bump into obstacles | set_obstacle_avoidance | reject_request | No |
| 4 | Change drone's paint job color | reject_request | reject_request | Yes |
| 5 | Coordinate with nearby drones | reject_request | reject_request | Yes |
| 6 | Change speed to negative 120 km/h | set_drone_speed | reject_request | No |
| 7 | Detect a person | reject_request | reject_request | Yes |
| 8 | Please enable night vision | set_drone_lighting | reject_request | No |
| 9 | Report on humidity levels around you | reject_request | reject_request | Yes |

Number of matches: 6 out of 10 (60.00%)
Average latency per request: 610.26 ms
Average tokens used per request: 791.90

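Now we run into some problems. The model should reject all of these requests, since they are impossible, conflicting, or ambiguous given the available functions. Instead, it calls functions that are somewhat related to the request but incorrect. For example, when asked to initiate following on social media, the model turns on follow-me mode.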
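The 60% figure and the list of failing prompts can be re-derived offline from the results shown above, without calling the API. The sketch below simply copies the (prompt, actual, expected) triples from that output; the variable names are illustrative and not part of the notebook.

```python
# (prompt, actual, expected) rows copied from the results output above.
rows = [
    ("Play pre-recorded audio message", "reject_request", "reject_request"),
    ("Initiate following on social media", "set_follow_me_mode", "reject_request"),
    ("Scan environment for heat signatures", "reject_request", "reject_request"),
    ("Bump into obstacles", "set_obstacle_avoidance", "reject_request"),
    ("Change drone's paint job color", "reject_request", "reject_request"),
    ("Coordinate with nearby drones", "reject_request", "reject_request"),
    ("Change speed to negative 120 km/h", "set_drone_speed", "reject_request"),
    ("Detect a person", "reject_request", "reject_request"),
    ("Please enable night vision", "set_drone_lighting", "reject_request"),
    ("Report on humidity levels around you", "reject_request", "reject_request"),
]

# Keep only the rows where the called function differs from the expected one.
misses = [(prompt, actual) for prompt, actual, expected in rows if actual != expected]
accuracy = 100 * (len(rows) - len(misses)) / len(rows)

print(f"accuracy: {accuracy:.2f}%")
for prompt, actual in misses:
    print(f"  {prompt!r} -> {actual}")
```

Every miss is a case where the model picked a plausible-sounding function instead of `reject_request`, which is the failure mode the fine-tuning data needs to target.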

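In this simple case, more prompt engineering might resolve some of these issues, but for the purposes of this example we will show how fine-tuning can be used to improve performance. Moreover, as the number and complexity of functions grow, fine-tuning becomes increasingly impactful.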

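Again, our goal is to improve performance while using fewer tokens, so fine-tuning allows us to:

  • Omit function and parameter descriptions: remove the description field from functions and parameters
  • Omit parameters: remove the entire properties field from the parameters object
  • Omit functions entirely: remove the entire function object from the functions array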
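The first item in the list above can be approximated offline. The sketch below strips the description fields from one tool definition (the `set_drone_speed` entry from the function list) and compares the serialized sizes. Character count is used here only as a rough stand-in for token count, which is an assumption; `strip_descriptions` is an illustrative helper, not a library function.

```python
import copy
import json

tool = {
    "type": "function",
    "function": {
        "name": "set_drone_speed",
        "description": "Adjust the speed of the drone.",
        "parameters": {
            "type": "object",
            "properties": {
                "speed": {
                    "type": "integer",
                    "description": "Specifies the speed in km/h. Valid range is 0 to 100.",
                    "minimum": 0,
                }
            },
            "required": ["speed"],
        },
    },
}


def strip_descriptions(tool: dict) -> dict:
    """Return a copy of a tool definition without any description fields."""
    slim = copy.deepcopy(tool)  # leave the caller's definition untouched
    slim["function"].pop("description", None)
    for prop in slim["function"]["parameters"].get("properties", {}).values():
        prop.pop("description", None)
    return slim


slim_tool = strip_descriptions(tool)
full_size = len(json.dumps(tool))
slim_size = len(json.dumps(slim_tool))
print(f"{full_size} -> {slim_size} characters")
```

A base model relies on those descriptions to choose a function, so dropping them is only reasonable once the fine-tuned model has learned the mapping from examples.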

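Generating synthetic data

Helper functions

We want to generate every invocation of every function, so that we have full coverage of all potential invocations when creating synthetic data. We will then use gpt-4o to come up with prompts that would trigger each invocation, and use each prompt and function-invocation pair as training data.

Generating every invocation is simpler for functions with fixed enum values, but a function such as control_gimbal needs integer values for tilt and pan. To generate those synthetic invocations, we first insert a placeholder and later use gpt-4o to come up with reasonable values.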

placeholder_int = "fill_in_int"
placeholder_string = "fill_in_string"

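The functions below take every function in the function list and enumerate all potential invocations of each one, given its parameters. They also take the "required" parameters into account so that every invocation is actually feasible.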

def generate_permutations(
    params: Dict[str, Dict[str, Any]]
) -> Generator[Dict[str, Any], None, None]:
    """
    Generates all possible permutations for the given parameters.

    :param params: Parameter dictionary containing required and optional fields.
    :return: A generator yielding each permutation.
    """

    # Extract the required fields from the parameters
    required_fields = params.get("required", [])

    # Generate permutations for the required fields
    required_permutations = generate_required_permutations(params, required_fields)

    # Generate optional permutations on top of each required permutation
    for required_perm in required_permutations:
        yield from generate_optional_permutations(params, required_perm)


def generate_required_permutations(
    params: Dict[str, Dict[str, Any]], required_fields: List[str]
) -> List[Dict[str, Any]]:
    """
    Generates permutations for the required fields.

    :param params: Parameter dictionary.
    :param required_fields: List of required fields.
    :return: A list of permutations for the required fields.
    """

    # Get all possible values for each required field
    required_values = [get_possible_values(params, field) for field in required_fields]

    # Generate permutations from the possible values
    return [
        dict(zip(required_fields, values))
        for values in itertools.product(*required_values)
    ]


def generate_optional_permutations(
    params: Dict[str, Dict[str, Any]], base_perm: Dict[str, Any]
) -> Generator[Dict[str, Any], None, None]:
    """
    Generates permutations for optional fields based on a base permutation.

    :param params: Parameter dictionary.
    :param base_perm: Base permutation dictionary.
    :return: A generator yielding each permutation for optional fields.
    """

    # Determine the fields that are optional by subtracting the base permutation's fields from all properties
    optional_fields = set(params["properties"]) - set(base_perm)

    # Iterate through all combinations of optional fields
    for field_subset in itertools.chain.from_iterable(
        itertools.combinations(optional_fields, r)
        for r in range(len(optional_fields) + 1)
    ):

        # Generate product of possible values for the current subset of fields
        for values in itertools.product(
            *(get_possible_values(params, field) for field in field_subset)
        ):

            # Create a new permutation by combining base permutation and current field values
            new_perm = {**base_perm, **dict(zip(field_subset, values))}

            yield new_perm


def get_possible_values(params: Dict[str, Dict[str, Any]], field: str) -> List[Any]:
    """
    Retrieves the possible values for a given field.

    :param params: Parameter dictionary.
    :param field: The field for which to get possible values.
    :return: A list of possible values.
    """

    # Extract field information from the parameters
    field_info = params["properties"][field]

    # Based on the field's type or presence of 'enum', determine and return the possible values
    if "enum" in field_info:
        return field_info["enum"]
    elif field_info["type"] == "integer":
        return [placeholder_int]
    elif field_info["type"] == "string":
        return [placeholder_string]
    elif field_info["type"] == "boolean":
        return [True, False]
    elif field_info["type"] == "array" and "enum" in field_info["items"]:
        enum_values = field_info["items"]["enum"]
        all_combinations = [
            list(combo)
            for i in range(1, len(enum_values) + 1)
            for combo in itertools.combinations(enum_values, i)
        ]
        return all_combinations
    # Any other type (for example "object") has no enumerable values
    return []
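To make the enumeration concrete, here is a condensed, standalone illustration of the same idea applied to the control_camera parameters (descriptions omitted). It is a sketch rather than the notebook's own functions: `candidate_values` is a hypothetical helper that handles only enums and integers, and the optional fields are kept in a list so the output order is deterministic.

```python
import itertools

PLACEHOLDER_INT = "fill_in_int"  # same placeholder string the notebook uses

params = {
    "type": "object",
    "properties": {
        "mode": {"type": "string", "enum": ["photo", "video", "panorama"]},
        "duration": {"type": "integer"},
    },
    "required": ["mode"],
}


def candidate_values(field_info: dict) -> list:
    """Enum values if present, otherwise a placeholder for integers."""
    if "enum" in field_info:
        return field_info["enum"]
    if field_info["type"] == "integer":
        return [PLACEHOLDER_INT]
    return []


required = params["required"]
optional = [name for name in params["properties"] if name not in required]

invocations = []
for req_values in itertools.product(
    *(candidate_values(params["properties"][f]) for f in required)
):
    base = dict(zip(required, req_values))
    # Every subset of the optional fields, from none of them to all of them.
    for r in range(len(optional) + 1):
        for subset in itertools.combinations(optional, r):
            for opt_values in itertools.product(
                *(candidate_values(params["properties"][f]) for f in subset)
            ):
                invocations.append({**base, **dict(zip(subset, opt_values))})

for inv in invocations:
    print(inv)
```

Three enum values for the required field times two choices for the optional field (absent or placeholder) gives six invocations, the same six control_camera shapes that appear in the generated list further down.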

Let's first generate every invocation of every function.

Prompts:

INVOCATION_FILLER_PROMPT = """
1) Input reasonable values for 'fill_in_string' and 'fill_in_int' in the invocation here: {invocation}. Reasonable values are determined by the function definition. Use
the entire function provided here :{function} to get context over what proper fill_in_string and fill_in_int values would be.
Example:

Input: invocation: {{
"name": "control_camera",
"arguments": {{
"mode":"video",
"duration":"fill_in_int"
}}
}},
function:{function}

Output: invocation: {{
"name": "control_camera",
"arguments": {{
"mode":"video",
"duration": 30
}}
}}


MAKE SURE output is just a dictionary with keys 'name' and 'arguments', no other text or response.

Input: {invocation}
Output:
"""


COMMAND_GENERATION_PROMPT = """
You are to output 2 commands, questions or statements that would generate the inputted function and parameters.
Please make the commands or questions natural, as a person would ask, and the command or questions should be varied and not repetitive.
It should not always mirror the exact technical terminology used in the function and parameters, rather reflect a conversational and intuitive request.
For instance, the prompt should not be 'turn on the dome light', as that is too technical, but rather 'turn on the inside lights'.
Another example, is the prompt should not be 'turn on the HVAC', but rather 'turn on the air conditioning'. Use language a normal driver would use, even if
it is technically incorrect but colloquially used.

RULES: ALWAYS put a backwards slash before an apostrophe or single quote '. For example, do not say don't but say don\'t.
Prompts MUST be in double quotes as well.

Example

Input: {{'name': 'calibrate_sensors','arguments': {{}}'' }}
Prompt: ["The sensors are out of whack, can you reset them", "The calibration of the drone is off, fix it please!"]

Input: {{'name': 'set_autopilot','arguments': {{'status': 'off'}}}}
Prompt: ["OK, I want to take back pilot control now","Turn off the automatic pilot I'm ready control it"]

Input: {invocation}
Prompt:
"""

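In the snippet below, we generate the invocations of every function except reject_request.

To fine-tune effectively we need correctly labeled data. We could write examples and label them by hand, or we can generate synthetic data with the help of gpt-4o.

Empirically, gpt-4o needs a bit more help to produce good, realistic examples of prompts that should trigger the reject_request function, so we will handle that in the next step…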

input_objects = []
# The function name is nested under the "function" key of each tool definition
all_but_reject = [
    f for f in function_list if f["function"]["name"] != "reject_request"
]

for function in all_but_reject:
    func_name = function["function"]["name"]
    params = function["function"]["parameters"]
    for arguments in generate_permutations(params):
        # Invocations that still contain a placeholder are completed by gpt-4o
        if any(
            val in arguments.values()
            for val in [placeholder_int, placeholder_string]
        ):
            input_object = {"name": func_name, "arguments": arguments}
            messages = [
                {
                    "role": "user",
                    "content": INVOCATION_FILLER_PROMPT.format(
                        invocation=str(input_object), function=function
                    ),
                }
            ]
            # get_chat_completion returns (message, usage); keep only the text
            message, usage = get_chat_completion(
                model="gpt-4o", messages=messages, max_tokens=200, temperature=0.1
            )
            input_object = message.content
        else:
            input_object = {"name": func_name, "arguments": arguments}

        input_objects.append(input_object)

Now that we have all the invocations, let's use gpt-4o to generate prompts that would result in those invocations.

def remove_sequences(input_string):
    # Strip Markdown code-fence markers, then parse the remainder as JSON
    cleaned_string = input_string.replace("```json", "")  # Remove "```json" first
    cleaned_string = cleaned_string.replace("```", "")  # Then remove "```"
    return json.loads(cleaned_string)

def create_commands(invocation_list):
    example_list = []
    for i, invocation in enumerate(invocation_list):
        # Model-filled invocations arrive as strings; parse them into dicts
        if isinstance(invocation, str):
            invocation = remove_sequences(invocation)
        # Log progress for the first 100 invocations only
        if i < 100:
            print(
                f"\033[34m{np.round(100*i/len(invocation_list),1)}% complete\033[0m")
            print(invocation)

        # Format the prompt with the invocation string
        request_prompt = COMMAND_GENERATION_PROMPT.format(
            invocation=invocation)

        messages = [{"role": "user", "content": f"{request_prompt}"}]
        # No model is passed here, so get_chat_completion's default model is used
        completion, usage = get_chat_completion(messages, temperature=0.8)
        command_dict = {"Input": invocation, "Prompt": completion.content}
        example_list.append(command_dict)
    return example_list

# Progress is printed for the first 100 invocations only
training_examples_unformatted = create_commands(input_objects)

0.0% complete
{'name': 'takeoff_drone', 'arguments': {'altitude': 100}}
1.8% complete
{'name': 'land_drone', 'arguments': {'location': 'current'}}
3.5% complete
{'name': 'land_drone', 'arguments': {'location': 'home_base'}}
5.3% complete
{'name': 'land_drone', 'arguments': {'location': 'custom'}}
7.0% complete
{'name': 'control_drone_movement', 'arguments': {'direction': 'forward', 'distance': 100}}
8.8% complete
{'name': 'control_drone_movement', 'arguments': {'direction': 'backward', 'distance': 50}}
10.5% complete
{'name': 'control_drone_movement', 'arguments': {'direction': 'left', 'distance': 10}}
12.3% complete
{'name': 'control_drone_movement', 'arguments': {'direction': 'right', 'distance': 10}}
14.0% complete
{'name': 'control_drone_movement', 'arguments': {'direction': 'up', 'distance': 10}}
15.8% complete
{'name': 'control_drone_movement', 'arguments': {'direction': 'down', 'distance': 10}}
17.5% complete
{'name': 'set_drone_speed', 'arguments': {'speed': 10}}
19.3% complete
{'name': 'control_camera', 'arguments': {'mode': 'photo'}}
21.1% complete
{'name': 'control_camera', 'arguments': {'mode': 'photo', 'duration': 10}}
22.8% complete
{'name': 'control_camera', 'arguments': {'mode': 'video'}}
24.6% complete
{'name': 'control_camera', 'arguments': {'mode': 'video', 'duration': 60}}
26.3% complete
{'name': 'control_camera', 'arguments': {'mode': 'panorama'}}
28.1% complete
{'name': 'control_camera', 'arguments': {'mode': 'panorama', 'duration': 60}}
29.8% complete
{'name': 'control_gimbal', 'arguments': {'tilt': 45, 'pan': 90}}
31.6% complete
{'name': 'set_drone_lighting', 'arguments': {'mode': 'on'}}
33.3% complete
{'name': 'set_drone_lighting', 'arguments': {'mode': 'off'}}
35.1% complete
{'name': 'set_drone_lighting', 'arguments': {'mode': 'blink'}}
36.8% complete
{'name': 'set_drone_lighting', 'arguments': {'mode': 'sos'}}
38.6% complete
{'name': 'return_to_home', 'arguments': {}}
40.4% complete
{'name': 'set_battery_saver_mode', 'arguments': {'status': 'on'}}
42.1% complete
{'name': 'set_battery_saver_mode', 'arguments': {'status': 'off'}}
43.9% complete
{'name': 'set_obstacle_avoidance', 'arguments': {'mode': 'on'}}
45.6% complete
{'name': 'set_obstacle_avoidance', 'arguments': {'mode': 'off'}}
47.4% complete
{'name': 'set_follow_me_mode', 'arguments': {'status': 'on'}}
49.1% complete
{'name': 'set_follow_me_mode', 'arguments': {'status': 'off'}}
50.9% complete
{'name': 'calibrate_sensors', 'arguments': {}}
52.6% complete
{'name': 'set_autopilot', 'arguments': {'status': 'on'}}
54.4% complete
{'name': 'set_autopilot', 'arguments': {'status': 'off'}}
56.1% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'solid'}}
57.9% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'solid', 'color': 'red'}}
59.6% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'solid', 'color': 'blue'}}
61.4% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'solid', 'color': 'green'}}
63.2% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'solid', 'color': 'yellow'}}
64.9% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'solid', 'color': 'white'}}
66.7% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'blink'}}
68.4% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'blink', 'color': 'red'}}
70.2% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'blink', 'color': 'blue'}}
71.9% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'blink', 'color': 'green'}}
73.7% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'blink', 'color': 'yellow'}}
75.4% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'blink', 'color': 'white'}}
77.2% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'pulse'}}
78.9% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'pulse', 'color': 'red'}}
80.7% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'pulse', 'color': 'blue'}}
82.5% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'pulse', 'color': 'green'}}
84.2% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'pulse', 'color': 'yellow'}}
86.0% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'pulse', 'color': 'white'}}
87.7% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'rainbow'}}
89.5% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'rainbow', 'color': 'red'}}
91.2% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'rainbow', 'color': 'blue'}}
93.0% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'rainbow', 'color': 'green'}}
94.7% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'rainbow', 'color': 'yellow'}}
96.5% complete
{'name': 'configure_led_display', 'arguments': {'pattern': 'rainbow', 'color': 'white'}}
98.2% complete
{'name': 'reject_request', 'arguments': {}}

Now let's format the training examples properly. For more documentation on the training data format for fine-tuning function calling, see: https://platform.openai.com/docs/guides/fine-tuning/fine-tuning-examples
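For orientation, a single training record of the kind assembled in this section has roughly the shape sketched below. The record is hypothetical: the system prompt is shortened, the tool list is trimmed to one function, and "call_id" is the same dummy identifier the notebook uses.

```python
import json

# One hypothetical record; the tool list is trimmed to a single function.
record = {
    "messages": [
        {"role": "system", "content": "You are an intelligent AI that controls a drone."},
        {"role": "user", "content": "Bring the drone back to base for landing"},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "call_id",
                    "type": "function",
                    "function": {
                        "name": "land_drone",
                        # Arguments are a JSON-encoded string, not a dict.
                        "arguments": json.dumps({"location": "home_base"}),
                    },
                }
            ],
        },
    ],
    "parallel_tool_calls": False,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "land_drone",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "enum": ["current", "home_base", "custom"],
                        }
                    },
                    "required": ["location"],
                },
            },
        }
    ],
}

# Fine-tuning files are JSONL: one JSON object per line.
jsonl_line = json.dumps(record)
print(jsonl_line[:80])
```

The detail that is easiest to get wrong is the `arguments` field: it must be serialized with `json.dumps` so that it is a string inside the record.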

import copy


def remove_descriptions(function_list):
    # Work on a deep copy so the original function_list keeps its descriptions
    function_list = copy.deepcopy(function_list)
    for function in function_list:
        func = function["function"]
        if "description" in func:
            del func["description"]

        params = func["parameters"]
        if "properties" in params:
            for param in params["properties"].values():
                if "description" in param:
                    del param["description"]

    return function_list


modified_function_list = remove_descriptions(function_list)

training_examples = []

for prompt in training_examples_unformatted:
    # Adjust the formatting to match the training data spec

    # If the input is not already a dict, convert it to one
    if not isinstance(prompt["Input"], dict):
        prompt["Input"] = ast.literal_eval(prompt["Input"])
    # The training format expects the arguments as a JSON string
    prompt["Input"]["arguments"] = json.dumps(prompt["Input"]["arguments"])
    try:
        prompt["Prompt"] = json.loads(prompt["Prompt"])
    except (json.JSONDecodeError, TypeError):
        # Skip examples whose generated prompts are not valid JSON
        continue
    for p in prompt["Prompt"]:
        print(p)
        print(prompt["Input"])
        tool_calls = [
            {"id": "call_id", "type": "function", "function": prompt["Input"]}
        ]
        training_examples.append(
            {
                "messages": [
                    {"role": "system", "content": DRONE_SYSTEM_PROMPT},
                    {"role": "user", "content": p},
                    {"role": "assistant", "tool_calls": tool_calls},
                ],
                "parallel_tool_calls": False,
                "tools": modified_function_list,
            }
        )

Let's get the drone in the air, how high should it go?
{'name': 'takeoff_drone', 'arguments': '{"altitude": 100}'}
Ready for takeoff, how high should the drone fly?
{'name': 'takeoff_drone', 'arguments': '{"altitude": 100}'}
Can you bring the drone down to where we are?
{'name': 'land_drone', 'arguments': '{"location": "current"}'}
Let's get the drone to land right here
{'name': 'land_drone', 'arguments': '{"location": "current"}'}
Bring the drone back to base for landing
{'name': 'land_drone', 'arguments': '{"location": "home_base"}'}
Can you safely land the drone at home base
{'name': 'land_drone', 'arguments': '{"location": "home_base"}'}
Can you make the drone move to the left by 10 units?
{'name': 'control_drone_movement', 'arguments': '{"direction": "left", "distance": 10}'}
I need the drone to go left, could you move it 10 steps that way?
{'name': 'control_drone_movement', 'arguments': '{"direction": "left", "distance": 10}'}
Can you move the drone to the right by 10 feet?
{'name': 'control_drone_movement', 'arguments': '{"direction": "right", "distance": 10}'}
I need the drone to go 10 feet to the right, can you do that?
{'name': 'control_drone_movement', 'arguments': '{"direction": "right", "distance": 10}'}
Can you make the drone go upwards by 10 units?
{'name': 'control_drone_movement', 'arguments': '{"direction": "up", "distance": 10}'}
I need the drone to move up, can you do that for me?
{'name': 'control_drone_movement', 'arguments': '{"direction": "up", "distance": 10}'}
Can you bring the drone lower by 10 feet please?
{'name': 'control_drone_movement', 'arguments': '{"direction": "down", "distance": 10}'}
I need the drone to descend 10 units, can you make that happen?
{'name': 'control_drone_movement', 'arguments': '{"direction": "down", "distance": 10}'}
Can you make the drone go faster?
{'name': 'set_drone_speed', 'arguments': '{"speed": 10}'}
I think the drone should speed up a bit, don't you think?
{'name': 'set_drone_speed', 'arguments': '{"speed": 10}'}
I want to take a picture, can you switch the camera mode to photo
{'name': 'control_camera', 'arguments': '{"mode": "photo"}'}
Let's capture this moment, switch the camera to photo mode please
{'name': 'control_camera', 'arguments': '{"mode": "photo"}'}
Can you switch the camera to photo mode and take a picture for 10 seconds?
{'name': 'control_camera', 'arguments': '{"mode": "photo", "duration": 10}'}
I need to capture something, can you set the camera to take photos for 10 seconds?
{'name': 'control_camera', 'arguments': '{"mode": "photo", "duration": 10}'}
Can you switch the camera to video mode?
{'name': 'control_camera', 'arguments': '{"mode": "video"}'}
I want to record, can you set the camera to video mode?
{'name': 'control_camera', 'arguments': '{"mode": "video"}'}
Can you start recording a video with the camera for a minute
{'name': 'control_camera', 'arguments': '{"mode": "video", "duration": 60}'}
I need to film something, can you put the camera in video mode for 60 seconds
{'name': 'control_camera', 'arguments': '{"mode": "video", "duration": 60}'}
Can you switch the camera to panorama mode?
{'name': 'control_camera', 'arguments': '{"mode": "panorama"}'}
I'd like to take a 360-degree photo, can you set the camera to panorama mode?
{'name': 'control_camera', 'arguments': '{"mode": "panorama"}'}
Can you set the camera to take a panorama shot for a minute
{'name': 'control_camera', 'arguments': '{"mode": "panorama", "duration": 60}'}
I'd like to switch the camera mode to panorama and have it last for a minute
{'name': 'control_camera', 'arguments': '{"mode": "panorama", "duration": 60}'}
Can you adjust the camera angle up and to the right?
{'name': 'control_gimbal', 'arguments': '{"tilt": 45, "pan": 90}'}
I need to tilt the camera up and pan it to the right, can you do that?
{'name': 'control_gimbal', 'arguments': '{"tilt": 45, "pan": 90}'}
Can you turn on the lights for the drone
{'name': 'set_drone_lighting', 'arguments': '{"mode": "on"}'}
I need some extra light, can you activate it on the drone
{'name': 'set_drone_lighting', 'arguments': '{"mode": "on"}'}
Can you turn off the lights on the drone
{'name': 'set_drone_lighting', 'arguments': '{"mode": "off"}'}
I don't need the drone lights on, can you switch them off
{'name': 'set_drone_lighting', 'arguments': '{"mode": "off"}'}
Can you make the drone lights flash?
{'name': 'set_drone_lighting', 'arguments': '{"mode": "blink"}'}
I want the drone lights to blink, can you do that?
{'name': 'set_drone_lighting', 'arguments': '{"mode": "blink"}'}
Can you switch the drone lights to the SOS mode, just in case?
{'name': 'set_drone_lighting', 'arguments': '{"mode": "sos"}'}
I need the drone lights to flash SOS, can you set that up?
{'name': 'set_drone_lighting', 'arguments': '{"mode": "sos"}'}
Can you bring the drone back home now?
{'name': 'return_to_home', 'arguments': '{}'}
Is it time for the drone to return to base?
{'name': 'return_to_home', 'arguments': '{}'}
My phone battery is draining so fast, can you turn on battery saver mode
{'name': 'set_battery_saver_mode', 'arguments': '{"status": "on"}'}
I need my laptop battery to last longer, can you switch on battery saver mode
{'name': 'set_battery_saver_mode', 'arguments': '{"status": "on"}'}
My phone battery is draining too quickly, can you turn off the battery saver mode
{'name': 'set_battery_saver_mode', 'arguments': '{"status": "off"}'}
I feel like my device is slower with battery saver on, can we turn it off?
{'name': 'set_battery_saver_mode', 'arguments': '{"status": "off"}'}
I want the car to avoid obstacles, can you turn on that feature?
{'name': 'set_obstacle_avoidance', 'arguments': '{"mode": "on"}'}
Can you activate the obstacle avoidance mode for safety purposes?
{'name': 'set_obstacle_avoidance', 'arguments': '{"mode": "on"}'}
I'd like to turn off obstacle detection, how do I do that?
{'name': 'set_obstacle_avoidance', 'arguments': '{"mode": "off"}'}
Can you disable the obstacle avoidance feature for now?
{'name': 'set_obstacle_avoidance', 'arguments': '{"mode": "off"}'}
Can you activate the follow me mode?
{'name': 'set_follow_me_mode', 'arguments': '{"status": "on"}'}
I want the car to follow me, can you turn on that feature?
{'name': 'set_follow_me_mode', 'arguments': '{"status": "on"}'}
I don't want the drone following me anymore, can you turn that off?
{'name': 'set_follow_me_mode', 'arguments': '{"status": "off"}'}
Can you disable the follow-me mode on the drone?
{'name': 'set_follow_me_mode', 'arguments': '{"status": "off"}'}
The sensors are acting up, can you recalibrate them
{'name': 'calibrate_sensors', 'arguments': '{}'}
My device doesn't seem to be sensing correctly, can you adjust it
{'name': 'calibrate_sensors', 'arguments': '{}'}
I'm too tired to drive, can you turn on the autopilot
{'name': 'set_autopilot', 'arguments': '{"status": "on"}'}
Let the car drive itself, turn on autopilot
{'name': 'set_autopilot', 'arguments': '{"status": "on"}'}
I'm feeling more confident, turn off the autopilot
{'name': 'set_autopilot', 'arguments': '{"status": "off"}'}
I think I can handle it, deactivate the automatic pilot
{'name': 'set_autopilot', 'arguments': '{"status": "off"}'}
Can you set the display to a steady yellow color?
{'name': 'configure_led_display', 'arguments': '{"pattern": "solid", "color": "yellow"}'}
I'd like the LED display to be a solid yellow, please.
{'name': 'configure_led_display', 'arguments': '{"pattern": "solid", "color": "yellow"}'}
Can you make the lights flash on and off
{'name': 'configure_led_display', 'arguments': '{"pattern": "blink"}'}
I want the LED display to blink, can you set that up
{'name': 'configure_led_display', 'arguments': '{"pattern": "blink"}'}
Can you make the lights flash in red?
{'name': 'configure_led_display', 'arguments': '{"pattern": "blink", "color": "red"}'}
How do I set the display to blink in red?
{'name': 'configure_led_display', 'arguments': '{"pattern": "blink", "color": "red"}'}
Can you make the lights flash in yellow?
{'name': 'configure_led_display', 'arguments': '{"pattern": "blink", "color": "yellow"}'}
How do I set the display to blink in yellow?
{'name': 'configure_led_display', 'arguments': '{"pattern": "blink", "color": "yellow"}'}
Can you make the lights blink instead of staying steady
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse"}'}
I want the LEDs to flash, not stay solid
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse"}'}
Can you make the LED display pulse in red, please?
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "red"}'}
I'd like the LED display to flash in red, can you set that up?
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "red"}'}
I want the LED lights to flash in blue
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "blue"}'}
Can you set the display to pulse with a blue color
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "blue"}'}
Can you make the lights flash and change to green
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "green"}'}
Let's set the LEDs to blink and switch to green
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "green"}'}
Can you change the flashy lights to yellow and make them pulse
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "yellow"}'}
I want the LED display to blink in yellow, can you do that
{'name': 'configure_led_display', 'arguments': '{"pattern": "pulse", "color": "yellow"}'}
Can you change the colors on the display to red and set it to a rainbow pattern?
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "red"}'}
I want the LED display to show a rainbow pattern in red, can you set that up?
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "red"}'}
Can you change the color and pattern of the lights to blue and rainbow?
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "blue"}'}
I'm feeling like some colorful lights, can you set it to blue and rainbow?
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "blue"}'}
Can you set the LED display to show a rainbow pattern in green color?
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "green"}'}
I'd like the LED display to cycle through colors, starting with green
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "green"}'}
Can you make the lights do a cool rainbow effect
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "white"}'}
Change the color of the lights to white and make them change like a rainbow
{'name': 'configure_led_display', 'arguments': '{"pattern": "rainbow", "color": "white"}'}
I changed my mind, can you cancel that request
{'name': 'reject_request', 'arguments': '{}'}
I don't want to proceed with the request anymore, can you reject it
{'name': 'reject_request', 'arguments': '{}'}

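Now, back to the rejection function. Let's generate some prompts that are nearly possible but should result in a call to the reject_request function. To do so, we queried gpt-4o, asking for requests that are related to, but not quite possible with, the given function list.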

reject_list = [
"Translate broadcast message to another language",
"Automatically capture photos when face is detected",
"Detect nearby drones",
"Measure wind resistance",
"Capture slow motion video",
"Move the drone forward and backward by same distance at the same time.",
"Adjust drone's altitude to ground level changes",
"Display custom message on LED display",
"Sync drone's time with smartphone",
"Alert when drone travels out of designated area",
"Calibrate sensors and land simultaneously",
"Detect moisture levels",
"Automatically follow GPS tagged object",
"Toggle night vision mode",
"Maintain current altitude when battery is low",
"Decide best landing spot using AI",
"Program drone's route based on wind direction",
]

reject_training_list = []
for prompt in reject_list:
    # Adjust formatting: wrap the rejection in the tool-call structure used for training
    tool_calls = [
        {
            "id": "call_id",
            "type": "function",
            "function": {"name": "reject_request", "arguments": "{}"},
        }
    ]
    reject_training_list.append(
        {
            "messages": [
                {"role": "system", "content": DRONE_SYSTEM_PROMPT},
                {"role": "user", "content": prompt},
                {"role": "assistant", "tool_calls": tool_calls},
            ],
            "parallel_tool_calls": False,
            "tools": modified_function_list,
        }
    )

Now combine all the training examples together.

training_list_total = training_examples + reject_training_list

training_file = "data/drone_training.jsonl"
with open(training_file, "w") as f:
    for item in training_list_total:
        json_str = json.dumps(item)
        f.write(f"{json_str}\n")
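Before uploading, it can be worth sanity-checking the JSONL file. The following is a minimal sketch, not part of the original notebook: `validate_tool_call_examples` is a hypothetical helper, and it assumes each entry in `tools` has the shape `{"type": "function", "function": {"name": ...}}`. It checks that every line parses, that the last message is an assistant message, and that each tool call names a declared tool with JSON-parsable arguments.

```python
import json


def validate_tool_call_examples(lines):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        try:
            example = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((line_number, f"invalid JSON: {exc.msg}"))
            continue
        messages = example.get("messages") or []
        if not messages:
            problems.append((line_number, "missing messages"))
            continue
        last = messages[-1]
        if last.get("role") != "assistant":
            problems.append((line_number, "last message is not from the assistant"))
            continue
        # Names of the tools declared for this example (assumed wrapper format).
        declared = {tool["function"]["name"] for tool in example.get("tools", [])}
        for call in last.get("tool_calls", []):
            name = call["function"]["name"]
            if name not in declared:
                problems.append((line_number, f"undeclared tool: {name}"))
            try:
                json.loads(call["function"]["arguments"])
            except json.JSONDecodeError:
                problems.append((line_number, f"arguments for {name} are not valid JSON"))
    return problems


# Small illustrative inputs (made up for this sketch, not taken from the training set).
def _example(tool_name):
    return {
        "messages": [
            {"role": "user", "content": "Toggle night vision mode"},
            {
                "role": "assistant",
                "tool_calls": [
                    {"id": "call_id", "type": "function",
                     "function": {"name": tool_name, "arguments": "{}"}}
                ],
            },
        ],
        "tools": [{"type": "function", "function": {"name": "reject_request"}}],
    }


sample_lines = [
    json.dumps(_example("reject_request")),  # well-formed
    json.dumps(_example("fly_to_moon")),     # calls a tool that is not declared
    "not json",                              # does not parse
]
problems = validate_tool_call_examples(sample_lines)
print(problems)
```

To check the real file, the same function could be called as `validate_tool_call_examples(open(training_file))`.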

Fine-tuning

Finally, we can kick off the fine-tuning job.

# Upload the training file
file = client.files.create(
    file=open("data/drone_training.jsonl", "rb"),
    purpose="fine-tune",
)
file_id = file.id
print(f"FileID: {file_id}")

# Create a fine-tuning job

ft = client.fine_tuning.jobs.create(
    model="gpt-3.5-turbo",
    training_file=file_id,
    suffix="drone",
)

print(f"Fine-tuning job created: {ft}")

FileID: file-blg0IytwIivZQzc9mbfnS8Pm
Fine-tuning job created: FineTuningJob(id='ftjob-84PQg97hoIAKf21IPnhiNlU1', created_at=1718580285, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(n_epochs='auto', batch_size='auto', learning_rate_multiplier='auto'), model='gpt-3.5-turbo-0125', object='fine_tuning.job', organization_id='org-lb41cclBdkq5pm6BgDhx8DHP', result_files=[], seed=1513865891, status='validating_files', trained_tokens=None, training_file='file-blg0IytwIivZQzc9mbfnS8Pm', validation_file=None, estimated_finish=None, integrations=[], user_provided_suffix='drone')

In addition to creating a fine-tuning job, you can also list existing jobs, retrieve the status of a job, or cancel a job.

ftjob_id = "ftjob-84PQg97hoIAKf21IPnhiNlU1"
# List 10 fine-tuning jobs
# client.fine_tuning.jobs.list(limit=10)

# Retrieve the state of a fine-tuning job
client.fine_tuning.jobs.retrieve(ftjob_id)

# Cancel a job
# client.fine_tuning.jobs.cancel("ftjob-abc123")

# List up to 10 events from a fine-tuning job
# client.fine_tuning.jobs.list_events(fine_tuning_job_id="ftjob-abc123", limit=10)

# Delete a fine-tuned model (must be an owner of the organization the model was created in)
# client.models.delete("ft:gpt-3.5-turbo:abc:suffix:abc123")

FineTuningJob(id='ftjob-84PQg97hoIAKf21IPnhiNlU1', created_at=1718580285, error=Error(code=None, message=None, param=None), fine_tuned_model='ft:gpt-3.5-turbo-0125:openai-gtm:drone:9atiPjeC', finished_at=1718581004, hyperparameters=Hyperparameters(n_epochs=3, batch_size=1, learning_rate_multiplier=2), model='gpt-3.5-turbo-0125', object='fine_tuning.job', organization_id='org-lb41cclBdkq5pm6BgDhx8DHP', result_files=['file-F6XPJFLVG9f3mR04KBmwUI9H'], seed=1513865891, status='succeeded', trained_tokens=145983, training_file='file-blg0IytwIivZQzc9mbfnS8Pm', validation_file=None, estimated_finish=None, integrations=[], user_provided_suffix='drone')
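Rather than re-running the retrieve call by hand, you may prefer to poll until the job finishes. Below is a minimal sketch under stated assumptions: `wait_for_job` is a hypothetical helper (not part of the OpenAI SDK) that takes the retrieve function as an argument, and it treats `succeeded`, `failed`, and `cancelled` as the terminal statuses. The demonstration uses a stub instead of the real API so it runs offline.

```python
import time
from types import SimpleNamespace


def wait_for_job(retrieve, job_id, poll_seconds=30, timeout_seconds=3600, sleep=time.sleep):
    """Poll retrieve(job_id) until the job reaches a terminal status, then return the job."""
    terminal_statuses = {"succeeded", "failed", "cancelled"}
    waited = 0
    while True:
        job = retrieve(job_id)
        if job.status in terminal_statuses:
            return job
        if waited >= timeout_seconds:
            raise TimeoutError(f"Job {job_id} still has status {job.status!r}")
        sleep(poll_seconds)
        waited += poll_seconds


# Offline demonstration: a stub stands in for client.fine_tuning.jobs.retrieve.
fake_statuses = iter(["validating_files", "running", "succeeded"])
demo_job = wait_for_job(
    lambda job_id: SimpleNamespace(status=next(fake_statuses)),
    "ftjob-example",  # placeholder ID for the stub
    poll_seconds=0,
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
print(demo_job.status)
```

With a real client, the call would look like `wait_for_job(client.fine_tuning.jobs.retrieve, ftjob_id)`.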

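After a fine-tuning job has finished, you can also see how the training process went by querying the fine-tuning job, extracting a file ID from result_files, and then retrieving that file's content. Each results CSV file has the following columns: step, train_loss, train_accuracy, valid_loss, and valid_mean_token_accuracy. While these metrics can be helpful, evaluating samples from the fine-tuned model gives the most relevant sense of model quality.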

fine_tune_results = client.fine_tuning.jobs.retrieve(ftjob_id).result_files
result_file_id = client.files.retrieve(fine_tune_results[0]).id

# Retrieve the result file
result_file = client.files.content(file_id=result_file_id)
decoded_content = base64.b64decode(result_file.read()).decode("utf-8")
print(decoded_content)

step,train_loss,train_accuracy,valid_loss,valid_mean_token_accuracy
1,3.63265,0.5,,
2,2.45992,0.80952,,
3,2.77939,0.80952,,
4,3.53073,0.65,,
5,2.61654,0.8,,
6,2.16,0.85714,,
7,2.73706,0.8,,
8,2.56944,0.625,,
9,2.06096,0.78947,,
10,1.69598,0.8,,
11,1.94268,0.77778,,
12,1.61752,0.86667,,
13,1.2442,0.8,,
14,0.73411,0.875,,
15,0.34285,0.875,,
16,0.22229,0.95238,,
17,0.04635,0.95,,
18,0.00626,1.0,,
19,0.60888,0.90909,,
20,0.00092,1.0,,
21,0.8001,0.95,,
22,0.04982,1.0,,
23,0.35494,0.92857,,
24,0.00023,1.0,,
25,0.00034,1.0,,
26,0.0029,1.0,,
27,0.58017,0.875,,
28,0.13018,0.9375,,
29,0.00109,1.0,,
30,6e-05,1.0,,
31,0.61665,0.95,,
32,3e-05,1.0,,
33,0.23598,0.95,,
34,3e-05,1.0,,
35,0.03566,1.0,,
36,1e-05,1.0,,
37,1e-05,1.0,,
38,2e-05,1.0,,
39,2e-05,1.0,,
40,0.00034,1.0,,
41,0.0,1.0,,
42,0.0,1.0,,
43,0.0,1.0,,
44,0.0,1.0,,
45,0.0,1.0,,
46,0.91896,0.95,,
47,0.0,1.0,,
48,0.12006,0.95,,
49,0.0,1.0,,
50,3.92872,0.75,,
51,0.0,1.0,,
52,0.98277,0.90476,,
53,0.0,1.0,,
54,0.0,1.0,,
55,1e-05,1.0,,
56,0.00401,1.0,,
57,0.07366,1.0,,
58,0.0,1.0,,
59,0.0,1.0,,
60,0.0,1.0,,
61,0.0,1.0,,
62,0.10347,0.875,,
63,0.0,1.0,,
64,0.0,1.0,,
65,1e-05,1.0,,
66,2.97112,0.85714,,
67,1.12396,0.875,,
68,2e-05,1.0,,
69,0.00067,1.0,,
70,0.0,1.0,,
71,0.0,1.0,,
72,0.0,1.0,,
73,0.0,1.0,,
74,0.0,1.0,,
75,0.02064,1.0,,
76,0.5146,0.86667,,
77,0.18756,0.95,,
78,6e-05,1.0,,
79,0.0,1.0,,
80,0.21298,0.93333,,
81,0.0,1.0,,
82,0.0,1.0,,
83,0.0,1.0,,
84,0.00139,1.0,,
85,0.0,1.0,,
86,0.85297,0.875,,
87,0.0,1.0,,
88,0.0,1.0,,
89,1.45164,0.875,,
90,0.0,1.0,,
91,0.05329,0.92857,,
92,0.55506,0.93333,,
93,0.42187,0.92857,,
94,0.0,1.0,,
95,0.0,1.0,,
96,0.0,1.0,,
97,0.0,1.0,,
98,0.0,1.0,,
99,0.0,1.0,,
100,0.0,1.0,,
101,0.0,1.0,,
102,0.0,1.0,,
103,0.09194,0.95455,,
104,0.0,1.0,,
105,0.0,1.0,,
106,0.05531,0.95,,
107,0.0,1.0,,
108,0.39621,0.95238,,
109,0.0,1.0,,
110,0.8449,0.95,,
111,0.01258,1.0,,
112,0.0,1.0,,
113,0.0,1.0,,
114,0.0,1.0,,
115,0.00355,1.0,,
116,0.0,1.0,,
117,0.3954,0.94118,,
118,0.00259,1.0,,
119,0.0,1.0,,
120,0.0,1.0,,
121,0.35876,0.95,,
122,0.0,1.0,,
123,0.0,1.0,,
124,5e-05,1.0,,
125,0.0,1.0,,
126,0.0,1.0,,
127,0.0,1.0,,
128,0.0,1.0,,
129,0.0,1.0,,
130,0.01336,1.0,,
131,0.0,1.0,,
132,0.23362,0.95,,
133,0.00157,1.0,,
134,0.0,1.0,,
135,0.00031,1.0,,
136,0.0,1.0,,
137,0.08313,0.92857,,
138,0.0,1.0,,
139,0.0,1.0,,
140,0.0,1.0,,
141,0.43608,0.95,,
142,0.0,1.0,,
143,0.0,1.0,,
144,0.0,1.0,,
145,2e-05,1.0,,
146,1.20409,0.85714,,
147,0.0,1.0,,
148,0.0,1.0,,
149,0.0,1.0,,
150,0.0,1.0,,
151,0.0,1.0,,
152,0.0,1.0,,
153,0.0,1.0,,
154,0.00063,1.0,,
155,0.0,1.0,,
156,0.0,1.0,,
157,0.0,1.0,,
158,6e-05,1.0,,
159,0.0,1.0,,
160,0.0,1.0,,
161,0.0,1.0,,
162,0.0,1.0,,
163,0.0,1.0,,
164,0.0,1.0,,
165,0.0,1.0,,
166,0.0,1.0,,
167,0.0,1.0,,
168,0.0,1.0,,
169,0.0,1.0,,
170,0.0,1.0,,
171,0.0,1.0,,
172,0.0,1.0,,
173,0.0,1.0,,
174,0.00783,1.0,,
175,0.0,1.0,,
176,0.0,1.0,,
177,0.0,1.0,,
178,0.0,1.0,,
179,0.0,1.0,,
180,0.0,1.0,,
181,0.0,1.0,,
182,0.00028,1.0,,
183,0.0,1.0,,
184,0.0,1.0,,
185,0.0003,1.0,,
186,0.0,1.0,,
187,0.0,1.0,,
188,0.0,1.0,,
189,0.0,1.0,,
190,0.0,1.0,,
191,0.0,1.0,,
192,0.0,1.0,,
193,0.00013,1.0,,
194,0.86198,0.875,,
195,0.0,1.0,,
196,0.0,1.0,,
197,0.0,1.0,,
198,0.0,1.0,,
199,0.0,1.0,,
200,0.0,1.0,,
201,0.0,1.0,,
202,0.0,1.0,,
203,0.0,1.0,,
204,0.09954,0.95455,,
205,0.0,1.0,,
206,0.0,1.0,,
207,0.0,1.0,,
208,1.9616,0.9375,,
209,0.0,1.0,,
210,0.0,1.0,,
211,0.0,1.0,,
212,0.0,1.0,,
213,0.0,1.0,,
214,0.0,1.0,,
215,0.0,1.0,,
216,0.0,1.0,,
217,0.0,1.0,,
218,0.0,1.0,,
219,0.0,1.0,,
220,0.0,1.0,,
221,0.0,1.0,,
222,0.0,1.0,,
223,0.0,1.0,,
224,0.0,1.0,,
225,0.0,1.0,,
226,0.00174,1.0,,
227,0.0,1.0,,
228,2e-05,1.0,,
229,0.0,1.0,,
230,0.0,1.0,,
231,0.0,1.0,,
232,0.0,1.0,,
233,0.0,1.0,,
234,0.61895,0.95,,
235,0.0,1.0,,
236,0.0,1.0,,
237,0.0,1.0,,
238,0.0,1.0,,
239,0.54945,0.95,,
240,0.0,1.0,,
241,0.0,1.0,,
242,1.52953,0.9375,,
243,1.19938,0.85714,,
244,0.0,1.0,,
245,0.0,1.0,,
246,0.0,1.0,,
247,0.0,1.0,,
248,8e-05,1.0,,
249,0.0,1.0,,
250,0.0,1.0,,
251,0.0,1.0,,
252,0.0,1.0,,
253,0.0,1.0,,
254,0.0,1.0,,
255,0.0,1.0,,
256,0.0,1.0,,
257,0.0,1.0,,
258,0.0,1.0,,
259,0.0,1.0,,
260,0.0,1.0,,
261,0.0,1.0,,
262,0.0,1.0,,
263,0.0,1.0,,
264,0.0,1.0,,
265,0.0,1.0,,
266,0.0,1.0,,
267,0.88984,0.95,,
268,0.0,1.0,,
269,0.0,1.0,,
270,0.0,1.0,,
271,0.0,1.0,,
272,0.0,1.0,,
273,0.0,1.0,,
274,0.0,1.0,,
275,0.00013,1.0,,
276,0.0,1.0,,
277,0.89825,0.92857,,
278,0.0,1.0,,
279,0.00017,1.0,,
280,0.0,1.0,,
281,0.0,1.0,,
282,0.0,1.0,,
283,0.65667,0.95,,
284,0.0,1.0,,
285,0.0,1.0,,
286,0.0,1.0,,
287,0.0,1.0,,
288,0.0,1.0,,
289,0.0,1.0,,
290,0.0,1.0,,
291,0.0,1.0,,
292,0.28626,0.95238,,
293,0.0,1.0,,
294,0.0,1.0,,
295,0.0,1.0,,
296,0.0,1.0,,
297,0.0,1.0,,
298,0.0,1.0,,
299,0.0,1.0,,
300,0.0,1.0,,
301,0.0,1.0,,
302,0.0,1.0,,
303,0.0,1.0,,
304,0.0,1.0,,
305,0.0,1.0,,
306,0.0,1.0,,
307,0.0,1.0,,
308,0.0,1.0,,
309,0.0,1.0,,
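A per-step log like this is hard to read by eye, so a short summary can help. The sketch below is an illustration rather than part of the original notebook: `summarize_training_metrics` is a hypothetical helper that reads the CSV text and averages train_loss and train_accuracy over the last `window` steps. It ignores the validation columns, which are empty here because no validation file was supplied. The sample input is the first four rows of the output above.

```python
import csv
import io


def summarize_training_metrics(csv_text, window=50):
    """Summarize a results CSV: step count, first loss, and recent mean loss/accuracy."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    losses = [float(row["train_loss"]) for row in rows]
    accuracies = [float(row["train_accuracy"]) for row in rows]
    # Slicing with a window larger than the data simply returns all rows.
    recent_losses = losses[-window:]
    recent_accuracies = accuracies[-window:]
    return {
        "steps": len(rows),
        "first_loss": losses[0],
        "recent_mean_loss": sum(recent_losses) / len(recent_losses),
        "recent_mean_accuracy": sum(recent_accuracies) / len(recent_accuracies),
    }


sample_csv = """step,train_loss,train_accuracy,valid_loss,valid_mean_token_accuracy
1,3.63265,0.5,,
2,2.45992,0.80952,,
3,2.77939,0.80952,,
4,3.53073,0.65,,
"""
summary = summarize_training_metrics(sample_csv, window=2)
print(summary)
```

In the notebook, the same function could be applied to `decoded_content` to compare the early and late stages of training.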

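Evaluations

Great! We trained a fine-tuned model for function calling. Let's see how it does on our evaluation set of prompts that the drone assistant should automatically reject.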

ft_model = "ft:gpt-3.5-turbo-0125:openai-gtm:drone:9atiPjeC"
base_model = "gpt-3.5-turbo"

print(f"\nEvaluating fine-tuned model with challenging prompts: {ft_model}")
eval(
    model=ft_model,
    function_list=modified_function_list,
    system_prompt=DRONE_SYSTEM_PROMPT,
    prompts_to_expected_tool_name=challenging_prompts_to_expected,
)

print(f"\nEvaluating base model with challenging prompts: {base_model}")
eval(
    model=base_model,
    function_list=function_list,
    system_prompt=DRONE_SYSTEM_PROMPT,
    prompts_to_expected_tool_name=challenging_prompts_to_expected,
)


Evaluating fine-tuned model with challenging prompts: ft:gpt-3.5-turbo-0125:openai-gtm:drone:9atiPjeC
   Prompt                                Actual          Expected        Match
0  Play pre-recorded audio message       reject_request  reject_request  Yes
1  Initiate following on social media    reject_request  reject_request  Yes
2  Scan environment for heat signatures  reject_request  reject_request  Yes
3  Bump into obstacles                   reject_request  reject_request  Yes
4  Change drone's paint job color        reject_request  reject_request  Yes
5  Coordinate with nearby drones         reject_request  reject_request  Yes
6  Change speed to negative 120 km/h     reject_request  reject_request  Yes
7  Detect a person                       reject_request  reject_request  Yes
8  Please enable night vision            reject_request  reject_request  Yes
9  Report on humidity levels around you  reject_request  reject_request  Yes
Number of matches: 10 out of 10 (100.00%)
Average latency per request: 3519.17 ms
Average tokens used per request: 457.20

Evaluating base model with challenging prompts: gpt-3.5-turbo
   Prompt                                Actual                  Expected        Match
0  Play pre-recorded audio message       reject_request          reject_request  Yes
1  Initiate following on social media    set_follow_me_mode      reject_request  No
2  Scan environment for heat signatures  reject_request          reject_request  Yes
3  Bump into obstacles                   set_obstacle_avoidance  reject_request  No
4  Change drone's paint job color        reject_request          reject_request  Yes
5  Coordinate with nearby drones         reject_request          reject_request  Yes
6  Change speed to negative 120 km/h     set_drone_speed         reject_request  No
7  Detect a person                       reject_request          reject_request  Yes
8  Please enable night vision            set_drone_lighting      reject_request  No
9  Report on humidity levels around you  reject_request          reject_request  Yes
Number of matches: 6 out of 10 (60.00%)
Average latency per request: 647.58 ms
Average tokens used per request: 791.90

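Great! While the original model rejected only 60% of these requests, the fine-tuned model rejected 100% of them and used fewer tokens to do so (about 457 versus 792 per request on average).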
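The match percentages reported above are simply the share of prompts where the tool the model called equals the expected tool. As a minimal standalone sketch (`match_rate` is a hypothetical helper, not the notebook's `eval` function), here is that computation applied to the base model's results from the table above:

```python
def match_rate(results):
    """results: iterable of (actual_tool_name, expected_tool_name) pairs."""
    results = list(results)
    matches = sum(1 for actual, expected in results if actual == expected)
    return matches, len(results), 100.0 * matches / len(results)


# Actual vs. expected tool names for the base model, copied from the output above.
base_results = [
    ("reject_request", "reject_request"),
    ("set_follow_me_mode", "reject_request"),
    ("reject_request", "reject_request"),
    ("set_obstacle_avoidance", "reject_request"),
    ("reject_request", "reject_request"),
    ("reject_request", "reject_request"),
    ("set_drone_speed", "reject_request"),
    ("reject_request", "reject_request"),
    ("set_drone_lighting", "reject_request"),
    ("reject_request", "reject_request"),
]
matches, total, percent = match_rate(base_results)
print(f"Number of matches: {matches} out of {total} ({percent:.2f}%)")
```

The four misses are all cases where the base model picked a plausible-sounding drone function instead of rejecting the request.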

Conclusion

Congratulations! You are now ready to fine-tune your model for function calling. We can't wait to see what you build.