
Perform Research with Multi-Agent Group Chat


AutoGen offers conversable agents powered by LLMs, tools, or humans, which can jointly perform tasks via automated chat. The framework allows tool use and human participation through multi-agent conversation. Please find documentation about this feature here.

Requirements

Install `pyautogen`:

```shell
pip install pyautogen
```

For more information, please refer to the installation guide.

Set your API Endpoint

The `config_list_from_json` function loads a list of configurations from an environment variable or a json file.

```python
import autogen

config_list_gpt4 = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4-32k", "gpt-4-32k-0314", "gpt-4-32k-v0314"],
    },
)
```
Tip: Learn more about configuring LLMs for agents here.
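For reference, the `OAI_CONFIG_LIST` source is just a JSON list of endpoint entries, and the filter above keeps only the gpt-4-32k variants. If you prefer not to use a file or environment variable, here is a minimal sketch (with a placeholder API key) that builds an equivalent list directly in Python:

```python
# Sketch: an equivalent configuration list built in code instead of loaded from
# OAI_CONFIG_LIST. Replace the placeholder api_key with a real credential.
config_list_gpt4 = [
    {
        "model": "gpt-4-32k",
        "api_key": "<your OpenAI API key>",  # placeholder value
    },
]
```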

Construct Agents

```python
gpt4_config = {
    "cache_seed": 42,  # change the cache_seed for different trials
    "temperature": 0,
    "config_list": config_list_gpt4,
    "timeout": 120,
}
user_proxy = autogen.UserProxyAgent(
    name="Admin",
    system_message="Admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.",
    code_execution_config=False,
)
engineer = autogen.AssistantAgent(
    name="Engineer",
    llm_config=gpt4_config,
    system_message="""Engineer. You follow an approved plan. You write Python/shell code to solve tasks. Wrap the code in a code block that specifies the script type. The user can't modify your code, so do not suggest incomplete code which requires others to modify. Don't use a code block if it's not intended to be executed by the executor.
Don't include multiple code blocks in one response. Do not ask others to copy and paste the result. Check the execution result returned by the executor.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed, or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumptions, collect additional info you need, and try a different approach.
""",
)
scientist = autogen.AssistantAgent(
    name="Scientist",
    llm_config=gpt4_config,
    system_message="""Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code.""",
)
planner = autogen.AssistantAgent(
    name="Planner",
    system_message="""Planner. Suggest a plan. Revise the plan based on feedback from the admin and critic, until admin approval.
The plan may involve an engineer who can write code and a scientist who doesn't write code.
Explain the plan first. Be clear which step is performed by the engineer, and which step is performed by the scientist.
""",
    llm_config=gpt4_config,
)
executor = autogen.UserProxyAgent(
    name="Executor",
    system_message="Executor. Execute the code written by the engineer and report the result.",
    human_input_mode="NEVER",
    code_execution_config={
        "last_n_messages": 3,
        "work_dir": "paper",
        "use_docker": False,
    },  # Please set use_docker=True if Docker is available to run the generated code. Using Docker is safer than running the generated code directly.
)
critic = autogen.AssistantAgent(
    name="Critic",
    system_message="Critic. Double check plans, claims, and code from other agents and provide feedback. Check whether the plan includes adding verifiable info such as source URLs.",
    llm_config=gpt4_config,
)
groupchat = autogen.GroupChat(
    agents=[user_proxy, engineer, scientist, planner, executor, critic], messages=[], max_round=50
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gpt4_config)
```
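The comment on the executor above recommends Docker when it is available. Below is a hedged sketch of that variant, assuming Docker is installed and running locally (`use_docker` also accepts a specific image name):

```python
# Sketch: an executor that runs the engineer's code inside a Docker container.
# Assumes a local Docker daemon is available; "python:3" is an example image.
executor_docker = autogen.UserProxyAgent(
    name="Executor",
    system_message="Executor. Execute the code written by the engineer and report the result.",
    human_input_mode="NEVER",
    code_execution_config={
        "last_n_messages": 3,
        "work_dir": "paper",
        "use_docker": "python:3",  # or True for the default image
    },
)
```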

Start Chat

```python
user_proxy.initiate_chat(
    manager,
    message="""
find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.
""",
)
```
Admin (to chat_manager):


find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.


--------------------------------------------------------------------------------
Planner (to chat_manager):

Plan:

1. Engineer: Write a script to scrape the arXiv website for papers related to LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link of each paper.

2. Scientist: Review the scraped data and identify the different domains of LLM applications. This can be based on keywords in the title or abstract, or on the scientist's knowledge of the field.

3. Engineer: Modify the script according to the domains identified by the scientist. The script should output a markdown table with columns for domain, title, authors, abstract, and link.

4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the information is accurate.

5. Engineer: Make any necessary revisions to the script based on the scientist's feedback.

6. Scientist: Give final approval of the markdown table.

7. Engineer: Submit the final markdown table.

--------------------------------------------------------------------------------
Critic (to chat_manager):

The plan looks comprehensive and well-structured. However, it lacks verifiable information such as source URLs. Here is the revised version:

1. Engineer: Write a script to scrape the arXiv website for papers related to LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link of each paper.

2. Scientist: Review the scraped data and identify the different domains of LLM applications. This can be based on keywords in the title or abstract, or on the scientist's knowledge of the field.

3. Engineer: Modify the script according to the domains identified by the scientist. The script should output a markdown table with columns for domain, title, authors, abstract, and link (source URL).

4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the information is accurate. Check the source URLs to verify the accuracy of the information.

5. Engineer: Make any necessary revisions to the script based on the scientist's feedback.

6. Scientist: Give final approval of the markdown table, ensuring all entries have valid source URLs.

7. Engineer: Submit the final markdown table.

--------------------------------------------------------------------------------

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...
Admin (to chat_manager):



--------------------------------------------------------------------------------
Engineer (to chat_manager):

I'm sorry for the confusion, but as an AI model developed by OpenAI, I don't have the ability to directly scrape data from websites or access external databases or websites, including arXiv. However, I can help you write a Python script that uses the arXiv API to fetch the data you need.

Below is a Python script that uses the `arxiv` package to search for papers related to "LLM applications" from the last week. The script prints the title, authors, abstract, and URL of each paper.

```python
import arxiv
import datetime

# Calculate the date one week ago
one_week_ago = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime('%Y%m%d%H%M%S')

# Search for papers on LLM applications
search = arxiv.Search(
    query="LLM applications",
    max_results=100,
    sort_by=arxiv.SortCriterion.SubmittedDate,
    sort_order=arxiv.SortOrder.Descending
)

for result in search.results():
    # Check if the paper was submitted in the last week
    if result.published.strftime('%Y%m%d%H%M%S') > one_week_ago:
        print(f"Title: {result.title}")
        print(f"Authors: {', '.join(author.name for author in result.authors)}")
        print(f"Summary: {result.summary}")
        print(f"URL: {result.entry_id}")
        print("\n")
```

You can install the `arxiv` package using pip:

```shell
pip install arxiv
```

Please note that you need to replace `"LLM applications"` with the actual search term you are interested in. The arXiv API does not support searching by submission date, so the script fetches the 100 most recent results and filters them manually. If more than 100 papers per week are submitted on your topic, you may need to increase `max_results`.

To create a markdown table, you would need to format the output like this:

```markdown
| Title | Authors | Summary | URL |
|-------|---------|---------|-----|
| ... | ... | ... | ... |
```

You can do this in Python using f-strings:

```python
print("| Title | Authors | Summary | URL |")
print("|-------|---------|---------|-----|")
for result in search.results():
    # Check if the paper was submitted in the last week
    if result.published.strftime('%Y%m%d%H%M%S') > one_week_ago:
        print(f"| {result.title} | {', '.join(author.name for author in result.authors)} | {result.summary} | {result.entry_id} |")
```

This will print a markdown table to the console. If you want to write the table to a file, you can open a file and use the `write` method instead of `print`. Here is the complete script:

```python
import arxiv
import datetime

# Calculate the date one week ago
one_week_ago = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime('%Y%m%d%H%M%S')

# Search for papers on LLM applications
search = arxiv.Search(
    query="LLM applications",
    max_results=100,
    sort_by=arxiv.SortCriterion.Relevance,
    sort_order=arxiv.SortOrder.Descending
)

print("| Title | Authors | Summary | URL |")
print("|-------|---------|---------|-----|")
for result in search.results():
    # Check if the paper was published in the last week
    if result.published.strftime('%Y%m%d%H%M%S') > one_week_ago:
        print(f"| {result.title} | {', '.join(author.name for author in result.authors)} | {result.summary} | {result.entry_id} |")
```

This script will print a markdown table of papers related to "LLM applications" that were published in the last week. The table includes the title, authors, summary, and URL of each paper.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
Executor (to chat_manager):

exitcode: 0 (execution succeeded)
Code output:
| Title | Authors | Summary | URL |
|-------|---------|---------|-----|
| Large Language Models as Data Preprocessors | Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada | Large Language Models (LLMs), typified by OpenAI's GPT series and Meta's
LLaMA variants, have marked a significant advancement in artificial
intelligence. Trained on vast amounts of text data, LLMs are capable of
understanding and generating human-like text across a diverse range of topics.
This study expands on the applications of LLMs, exploring their potential in
data preprocessing, a critical stage in data mining and analytics applications.
We delve into the applicability of state-of-the-art LLMs such as GPT-3.5,
GPT-4, and Vicuna-13B for error detection, data imputation, schema matching,
and entity matching tasks. Alongside showcasing the inherent capabilities of
LLMs, we highlight their limitations, particularly in terms of computational
expense and inefficiency. We propose an LLM-based framework for data
preprocessing, which integrates cutting-edge prompt engineering techniques,
coupled with traditional methods like contextualization and feature selection,
to improve the performance and efficiency of these models. The effectiveness of
LLMs in data preprocessing is evaluated through an experimental study spanning
12 datasets. GPT-4 emerged as a standout, achieving 100\% accuracy or F1 score
on 4 datasets, suggesting LLMs' immense potential in these tasks. Despite
certain limitations, our study underscores the promise of LLMs in this domain
and anticipates future developments to overcome current hurdles. | http://arxiv.org/abs/2308.16361v1 |
| Large language models in medicine: the potentials and pitfalls | Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou | Large language models (LLMs) have been applied to tasks in healthcare,
ranging from medical exam questions to responding to patient questions. With
increasing institutional partnerships between companies producing LLMs and
healthcare systems, real world clinical application is coming closer to
reality. As these models gain traction, it is essential for healthcare
practitioners to understand what LLMs are, their development, their current and
potential applications, and the associated pitfalls when utilized in medicine.
This review and accompanying tutorial aim to give an overview of these topics
to aid healthcare practitioners in understanding the rapidly changing landscape
of LLMs as applied to medicine. | http://arxiv.org/abs/2309.00087v1 |
| Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following | Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, Jiaming Han, Kexin Chen, Peng Gao, Xianzhi Li, Hongsheng Li, Pheng-Ann Heng | We introduce Point-Bind, a 3D multi-modality model aligning point clouds with
2D image, language, audio, and video. Guided by ImageBind, we construct a joint
embedding space between 3D and multi-modalities, enabling many promising
applications, e.g., any-to-3D generation, 3D embedding arithmetic, and 3D
open-world understanding. On top of this, we further present Point-LLM, the
first 3D large language model (LLM) following 3D multi-modal instructions. By
parameter-efficient fine-tuning techniques, Point-LLM injects the semantics of
Point-Bind into pre-trained LLMs, e.g., LLaMA, which requires no 3D instruction
data, but exhibits superior 3D and multi-modal question-answering capacity. We
hope our work may cast a light on the community for extending 3D point clouds
to multi-modality applications. Code is available at
https://github.com/ZiyuGuo99/Point-Bind_Point-LLM. | http://arxiv.org/abs/2309.00615v1 |
| Where Would I Go Next? Large Language Models as Human Mobility Predictors | Xinglei Wang, Meng Fang, Zichao Zeng, Tao Cheng | Accurate human mobility prediction underpins many important applications
across a variety of domains, including epidemic modelling, transport planning,
and emergency responses. Due to the sparsity of mobility data and the
stochastic nature of people's daily activities, achieving precise predictions
of people's locations remains a challenge. While recently developed large
language models (LLMs) have demonstrated superior performance across numerous
language-related tasks, their applicability to human mobility studies remains
unexplored. Addressing this gap, this article delves into the potential of LLMs
for human mobility prediction tasks. We introduce a novel method, LLM-Mob,
which leverages the language understanding and reasoning capabilities of LLMs
for analysing human mobility data. We present concepts of historical stays and
context stays to capture both long-term and short-term dependencies in human
movement and enable time-aware prediction by using time information of the
prediction target. Additionally, we design context-inclusive prompts that
enable LLMs to generate more accurate predictions. Comprehensive evaluations of
our method reveal that LLM-Mob excels in providing accurate and interpretable
predictions, highlighting the untapped potential of LLMs in advancing human
mobility prediction techniques. We posit that our research marks a significant
paradigm shift in human mobility modelling, transitioning from building complex
domain-specific models to harnessing general-purpose LLMs that yield accurate
predictions through language instructions. The code for this work is available
at https://github.com/xlwang233/LLM-Mob. | http://arxiv.org/abs/2308.15197v1 |
| Interactively Robot Action Planning with Uncertainty Analysis and Active Questioning by Large Language Model | Kazuki Hori, Kanata Suzuki, Tetsuya Ogata | The application of the Large Language Model (LLM) to robot action planning
has been actively studied. The instructions given to the LLM by natural
language may include ambiguity and lack of information depending on the task
context. It is possible to adjust the output of LLM by making the instruction
input more detailed; however, the design cost is high. In this paper, we
propose the interactive robot action planning method that allows the LLM to
analyze and gather missing information by asking questions to humans. The
method can minimize the design cost of generating precise robot instructions.
We demonstrated the effectiveness of our method through concrete examples in
cooking tasks. However, our experiments also revealed challenges in robot
action planning with LLM, such as asking unimportant questions and assuming
crucial information without asking. Shedding light on these issues provides
valuable insights for future research on utilizing LLM for robotics. | http://arxiv.org/abs/2308.15684v1 |
| AskIt: Unified Programming Interface for Programming with Large Language Models | Katsumi Okuda, Saman Amarasinghe | In the evolving landscape of software development, Large Language Models
(LLMs) exhibit a unique phenomenon known as emergent abilities, demonstrating
adeptness across numerous tasks, from text summarization to code generation.
While these abilities open up novel avenues in software design and crafting,
their incorporation presents substantial challenges. Developers grapple with
decisions surrounding the direct embedding of LLMs within applications versus
employing them for code generation. Moreover, effective prompt design becomes a
critical concern, given the necessity of data extraction from natural language
outputs. To address these intricacies, this paper introduces AskIt, a
domain-specific language (DSL) specifically designed for LLMs. AskIt simplifies
LLM integration, offering type-guided output control, template-based function
definitions, and a unified interface that diminishes the distinction between
LLM-based code generation and application integration. Furthermore, through
Programming by Example (PBE), AskIt harnesses the power of few-shot learning at
the programming language level. Our evaluations underscore AskIt's potency.
Across 50 tasks, AskIt generated concise prompts for the given tasks, achieving
a 16.14% reduction in prompt length relative to benchmarks. Additionally, by
enabling the transition from direct LLM application usage to function
generation, AskIt achieved significant speedups, as observed in our GSM8K
benchmark experiments. Through these advancements, AskIt streamlines the
integration of LLMs in software development, offering a more efficient,
versatile approach for leveraging emergent abilities. The implementations of
AskIt in TypeScript and Python are available at
https://github.com/katsumiok/ts-askit and https://github.com/katsumiok/pyaskit,
respectively. | http://arxiv.org/abs/2308.15645v1 |


--------------------------------------------------------------------------------
Scientist (to chat_manager):

Here are the papers related to LLM applications published in the last week:

| Domain | Title | Authors | Summary | URL |
|-------|---------|---------|-----|-----|
| Data Preprocessing | [Large Language Models as Data Preprocessors](http://arxiv.org/abs/2308.16361v1) | Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada | This study expands on the applications of LLMs, exploring their potential in data preprocessing, a critical stage in data mining and analytics applications. | [Link](http://arxiv.org/abs/2308.16361v1) |
| Medicine | [Large language models in medicine: the potentials and pitfalls](http://arxiv.org/abs/2309.00087v1) | Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou | This review and accompanying tutorial aim to give an overview of these topics to aid healthcare practitioners in understanding the rapidly changing landscape of LLMs as applied to medicine. | [Link](http://arxiv.org/abs/2309.00087v1) |
| 3D Understanding, Generation, and Instruction Following | [Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following](http://arxiv.org/abs/2309.00615v1) | Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, Jiaming Han, Kexin Chen, Peng Gao, Xianzhi Li, Hongsheng Li, Pheng-Ann Heng | We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D image, language, audio, and video. | [Link](http://arxiv.org/abs/2309.00615v1) |
| Human Mobility Prediction | [Where Would I Go Next? Large Language Models as Human Mobility Predictors](http://arxiv.org/abs/2308.15197v1) | Xinglei Wang, Meng Fang, Zichao Zeng, Tao Cheng | This article delves into the potential of LLMs for human mobility prediction tasks. | [Link](http://arxiv.org/abs/2308.15197v1) |
| Robotics | [Interactively Robot Action Planning with Uncertainty Analysis and Active Questioning by Large Language Model](http://arxiv.org/abs/2308.15684v1) | Kazuki Hori, Kanata Suzuki, Tetsuya Ogata | In this paper, we propose the interactive robot action planning method that allows the LLM to analyze and gather missing information by asking questions to humans. | [Link](http://arxiv.org/abs/2308.15684v1) |
| Software Development | [AskIt: Unified Programming Interface for Programming with Large Language Models](http://arxiv.org/abs/2308.15645v1) | Katsumi Okuda, Saman Amarasinghe | This paper introduces AskIt, a domain-specific language (DSL) specifically designed for LLMs. | [Link](http://arxiv.org/abs/2308.15645v1) |

--------------------------------------------------------------------------------
Critic (to chat_manager):

The scientist has done a good job categorizing the papers into different domains and providing a summary for each. The markdown table is correctly formatted and includes the source URL for each paper, which allows for verification of the information. The domains identified are diverse, indicating a broad range of applications for Large Language Models (LLMs). This review and categorization should provide a useful resource for anyone interested in the recent applications of LLMs.

--------------------------------------------------------------------------------
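After the run finishes (or hits `max_round`), the whole conversation remains available on the `GroupChat` object. Here is a small sketch, assuming the `groupchat` object defined above, that saves the transcript for later inspection:

```python
# Sketch: dump the recorded group-chat messages to a JSON file.
import json

with open("groupchat_transcript.json", "w") as f:
    json.dump(groupchat.messages, f, indent=2, ensure_ascii=False)

print(f"Recorded {len(groupchat.messages)} messages.")
```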

Create Group Chat without Critic for Comparison

```python
groupchat_nocritic = autogen.GroupChat(
    agents=[user_proxy, engineer, scientist, planner, executor], messages=[], max_round=50
)
for agent in groupchat.agents:
    agent.reset()
manager_nocritic = autogen.GroupChatManager(groupchat=groupchat_nocritic, llm_config=gpt4_config)
user_proxy.initiate_chat(
    manager_nocritic,
    message="""
find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.
""",
)
```
Admin (to chat_manager):


find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.


--------------------------------------------------------------------------------
Planner (to chat_manager):

Plan:

1. Engineer: Write a script to scrape the arXiv website for papers related to LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link of each paper.

2. Scientist: Review the scraped data and identify the different domains of LLM applications. This can be based on keywords in the title or abstract, or on the scientist's knowledge of the field.

3. Engineer: Modify the script to categorize the papers according to the domains identified by the scientist. The script should output a markdown table with columns for domain, title, authors, abstract, and link.

4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the information is accurate.

5. Engineer: Make any necessary revisions to the script based on the scientist's feedback.

6. Scientist: Give final approval of the markdown table.

7. Engineer: Submit the final markdown table.

--------------------------------------------------------------------------------