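Function Calling Program for Structured Extraction
This guide shows you how to use our FunctionCallingProgram for structured data extraction. Given a function-calling LLM and an output Pydantic class, it generates a structured Pydantic object. We use three different function-calling LLMs: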
- OpenAI
- Anthropic Claude
- Mistral
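For the target object, you can either specify output_cls directly, or specify a PydanticOutputParser or any other BaseOutputParser that generates a Pydantic object.
In the examples below, we show different ways of extracting into an Album object (which can contain a list of Song objects).
Note: FunctionCallingProgram only works with LLMs that natively support function calling. It works by inserting the schema of the Pydantic object as the "tool parameters" of a tool. For all other LLMs, use our LLMTextCompletionProgram, which prompts the model directly through text to return a structured output.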
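The choice between the two program classes can be sketched as a small helper. This is only an illustration: `choose_program_name` is a hypothetical function, and it assumes the LLM object exposes a `metadata.is_function_calling_model` flag. The `SimpleNamespace` objects below are stand-ins so the sketch runs without any API key.

```python
from types import SimpleNamespace


def choose_program_name(llm) -> str:
    """Return the name of the program class that fits this LLM (hypothetical helper)."""
    metadata = getattr(llm, "metadata", None)
    # Assumption: function-calling support is advertised on llm.metadata.
    if getattr(metadata, "is_function_calling_model", False):
        return "FunctionCallingProgram"
    return "LLMTextCompletionProgram"


# Stand-ins for real LLM objects, used only to exercise the helper.
tool_llm = SimpleNamespace(metadata=SimpleNamespace(is_function_calling_model=True))
text_llm = SimpleNamespace(metadata=SimpleNamespace(is_function_calling_model=False))

print(choose_program_name(tool_llm))
print(choose_program_name(text_llm))
```

An object with no `metadata` attribute at all falls through to the text-completion branch, which is the safer default.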
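Define the Album Class
This is a simple example of parsing an output into an Album schema, which can contain multiple songs.
Just pass Album to the output_cls argument when initializing the FunctionCallingProgram.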
If you are opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
!pip install llama-index
from pydantic import BaseModel
from typing import List
from llama_index.core.program import FunctionCallingProgram
Define the output schema
class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]
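To make the "tool parameters" idea concrete, here is a hand-written approximation of the tool definition a function-calling LLM might receive for the Album schema. It is a sketch, not the exact payload LlamaIndex sends: the JSON Schema is written out by hand, and the outer wrapper follows the OpenAI-style `tools` format as an assumption.

```python
import json

# Hand-written JSON Schema for Song (illustrative, not generated by a library).
song_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "length_seconds": {"type": "integer"},
    },
    "required": ["title", "length_seconds"],
}

# Album nests Song as the item type of the "songs" array.
album_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "artist": {"type": "string"},
        "songs": {"type": "array", "items": song_schema},
    },
    "required": ["name", "artist", "songs"],
}

# OpenAI-style tool wrapper: the schema becomes the tool's "parameters".
album_tool = {
    "type": "function",
    "function": {
        "name": "Album",
        "description": "Data model for an album.",
        "parameters": album_schema,
    },
}

print(json.dumps(album_tool, indent=2))
```

The model is then asked to "call" this tool, and the arguments it produces are validated against the Pydantic class.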
Function Calling (Single Object)
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.openai import OpenAI
prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Use the movie {movie_name} as inspiration.\
"""
llm = OpenAI(model="gpt-3.5-turbo")
program = FunctionCallingProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    llm=llm,  # pass the LLM explicitly instead of relying on the default
    verbose=True,
)
Run the program to get structured output.
output = program(movie_name="The Shining")
=== Calling Function ===
Calling function: Album with args: {"name": "The Shining Soundtrack", "artist": "Various Artists", "songs": [{"title": "Main Title", "length_seconds": 180}, {"title": "Rocky Mountains", "length_seconds": 240}, {"title": "Lullaby", "length_seconds": 200}, {"title": "The Overlook Hotel", "length_seconds": 220}, {"title": "Grady's Story", "length_seconds": 180}, {"title": "The Maze", "length_seconds": 210}]}
=== Function Output ===
name='The Shining Soundtrack' artist='Various Artists' songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title="Grady's Story", length_seconds=180), Song(title='The Maze', length_seconds=210)]
The output is a valid Pydantic object that we can then use to call functions/APIs.
output
Out[ ]:
Album(name='The Shining Soundtrack', artist='Various Artists', songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title="Grady's Story", length_seconds=180), Song(title='The Maze', length_seconds=210)])
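Because the result is an ordinary Pydantic object, downstream code can use typed attribute access and serialization. The sketch below rebuilds a shortened Album by hand (two songs copied from the sample run above, so no LLM call is needed) and assumes Pydantic v2, where `model_dump()` replaces the older `dict()`. `total_minutes` is a hypothetical helper.

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# Values copied from the sample run; a real run will return different songs.
album = Album(
    name="The Shining Soundtrack",
    artist="Various Artists",
    songs=[
        Song(title="Main Title", length_seconds=180),
        Song(title="Rocky Mountains", length_seconds=240),
    ],
)


def total_minutes(album: Album) -> float:
    """Sum the song lengths and convert seconds to minutes."""
    return sum(song.length_seconds for song in album.songs) / 60


payload = album.model_dump()  # plain dict, ready to send to another API
print(total_minutes(album))
print(payload["songs"][0]["title"])
```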
Function Calling (Parallel Function Calling, Multiple Objects)
prompt_template_str = """\
Generate example albums, each with an artist and a list of songs, using each movie below as inspiration.

Here are the movies:
{movie_names}
"""
llm = OpenAI(model="gpt-3.5-turbo")
program = FunctionCallingProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    llm=llm,  # pass the LLM explicitly instead of relying on the default
    verbose=True,
    allow_parallel_tool_calls=True,  # one Album tool call per movie
)
output = program(movie_names="The Shining, The Blair Witch Project, Saw")
=== Calling Function ===
Calling function: Album with args: {"name": "The Shining", "artist": "Various Artists", "songs": [{"title": "Main Theme", "length_seconds": 180}, {"title": "The Overlook Hotel", "length_seconds": 240}, {"title": "Redrum", "length_seconds": 200}]}
=== Function Output ===
name='The Shining' artist='Various Artists' songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]
=== Calling Function ===
Calling function: Album with args: {"name": "The Blair Witch Project", "artist": "Soundtrack Ensemble", "songs": [{"title": "Into the Woods", "length_seconds": 210}, {"title": "The Rustling Leaves", "length_seconds": 180}, {"title": "The Witch's Curse", "length_seconds": 240}]}
=== Function Output ===
name='The Blair Witch Project' artist='Soundtrack Ensemble' songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title="The Witch's Curse", length_seconds=240)]
=== Calling Function ===
Calling function: Album with args: {"name": "Saw", "artist": "Horror Soundscapes", "songs": [{"title": "The Reverse Bear Trap", "length_seconds": 220}, {"title": "Jigsaw's Game", "length_seconds": 260}, {"title": "Bathroom Escape", "length_seconds": 180}]}
=== Function Output ===
name='Saw' artist='Horror Soundscapes' songs=[Song(title='The Reverse Bear Trap', length_seconds=220), Song(title="Jigsaw's Game", length_seconds=260), Song(title='Bathroom Escape', length_seconds=180)]
output
Out[ ]:
[Album(name='The Shining', artist='Various Artists', songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]), Album(name='The Blair Witch Project', artist='Soundtrack Ensemble', songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title="The Witch's Curse", length_seconds=240)]), Album(name='Saw', artist='Horror Soundscapes', songs=[Song(title='The Reverse Bear Trap', length_seconds=220), Song(title="Jigsaw's Game", length_seconds=260), Song(title='Bathroom Escape', length_seconds=180)])]
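Note the difference in return shape: the single-object run returned one Album, while the parallel run returned a list of Albums. Code that has to handle both cases can normalize the result first. `as_list` below is a hypothetical helper, and plain dicts stand in for Album objects so the sketch runs without an LLM call.

```python
from typing import Any, List


def as_list(result: Any) -> List[Any]:
    """Wrap a single extracted object in a list; pass lists through unchanged."""
    return result if isinstance(result, list) else [result]


# Plain dicts stand in for Album objects returned by the program.
single = {"name": "The Shining"}
parallel = [{"name": "The Shining"}, {"name": "Saw"}]

for album in as_list(single) + as_list(parallel):
    print(album["name"])
```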
Function Calling Program with Anthropic
Here we use Claude Sonnet (all three models in the Claude 3 family support function calling).
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.anthropic import Anthropic
prompt_template_str = "Generate a song about {topic}."
llm = Anthropic(model="claude-3-sonnet-20240229")
program = FunctionCallingProgram.from_defaults(
    output_cls=Song,
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)
output = program(topic="harry potter")
=== Calling Function ===
Calling function: Song with args: {"title": "The Boy Who Lived", "length_seconds": 180}
=== Function Output ===
title='The Boy Who Lived' length_seconds=180
output
Out[ ]:
Song(title='The Boy Who Lived', length_seconds=180)
Function Calling Program with Mistral
Here we use mistral-large.
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.mistralai import MistralAI
# prompt_template_str = """\
# Generate an example album, with an artist and a list of songs. \
# Use the broadway show {broadway_show} as inspiration. \
# Make sure to use the tool.
# """
prompt_template_str = "Generate a song about {topic}."
llm = MistralAI(model="mistral-large-latest")
program = FunctionCallingProgram.from_defaults(
    output_cls=Song,
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)
output = program(topic="the broadway show Wicked")
=== Calling Function ===
Calling function: Song with args: {"title": "Defying Gravity", "length_seconds": 240}
=== Function Output ===
title='Defying Gravity' length_seconds=240
output
Out[ ]:
Song(title='Defying Gravity', length_seconds=240)
For LLMs without native function calling, the same Album schema can be extracted with LLMTextCompletionProgram and a PydanticOutputParser.
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.core.program import LLMTextCompletionProgram

# The template must contain {movie_name}, which the call below fills in.
prompt_template_str = "Generate an example album, with an artist and a list of songs, inspired by the movie {movie_name}."
program = LLMTextCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(output_cls=Album),
    prompt_template_str=prompt_template_str,
    verbose=True,
)
output = program(movie_name="Lord of the Rings")
output
Out[ ]:
Album(name='The Fellowship of the Ring', artist='Middle-earth Ensemble', songs=[Song(title='The Shire', length_seconds=240), Song(title='Concerning Hobbits', length_seconds=180), Song(title='The Ring Goes South', length_seconds=300), Song(title='A Knife in the Dark', length_seconds=270), Song(title='Flight to the Ford', length_seconds=210), Song(title='Many Meetings', length_seconds=240), Song(title='The Council of Elrond', length_seconds=330), Song(title='The Great Eye', length_seconds=180), Song(title='The Breaking of the Fellowship', length_seconds=360)])
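With a text-completion program there is no tool call, so the structured object has to be recovered from free text. The sketch below is a simplified stand-in for that parsing step, not the actual PydanticOutputParser implementation: `extract_json_object` is a hypothetical helper that pulls the outermost JSON object out of a completion, and the `completion` string is an invented sample.

```python
import json


def extract_json_object(text: str) -> dict:
    """Parse the span from the first '{' to the last '}' as a JSON object."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in completion")
    return json.loads(text[start : end + 1])


# Invented sample completion: JSON surrounded by conversational filler.
completion = (
    "Here is the album:\n"
    '{"name": "The Fellowship of the Ring", "artist": "Middle-earth Ensemble", '
    '"songs": [{"title": "The Shire", "length_seconds": 240}]}\n'
    "Enjoy!"
)

album_dict = extract_json_object(completion)
print(album_dict["songs"][0]["title"])
```

The resulting dict would then be validated against the Album class, which is where malformed or incomplete output gets rejected.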