Refine with Structured Answer Filtering¶
When synthesizing responses with our Refine response synthesizer, it is critical to filter out non-answers. A common problem is the propagation of a single unhelpful response like "I don't know the answer", which can persist throughout the synthesis process and lead to a final answer of the same nature. This can happen even when the actual answer is present in other, more relevant sections.
These unhelpful responses can be filtered out by setting structured_answer_filtering to True. It is set to False by default, since it currently only works best with OpenAI models that support function calling.
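Conceptually, each refine step returns a structured response containing both an answer and a flag indicating whether the query was actually satisfied; answers that do not satisfy the query are simply not carried forward. The sketch below is an illustrative simplification of that idea, not LlamaIndex's actual implementation (the class name mirrors the StructuredRefineResponse seen in the verbose logs later in this notebook):

```python
from dataclasses import dataclass


# Illustrative stand-in for the structured response each refine step produces.
@dataclass
class StructuredRefineResponse:
    answer: str
    query_satisfied: bool


def refine_with_filtering(steps):
    """Keep only answers whose step reported query_satisfied=True,
    so an early "I don't know" cannot propagate to the final answer."""
    final = None
    for step in steps:
        if step.query_satisfied:
            final = step.answer
    return final


steps = [
    StructuredRefineResponse("I don't know the answer.", False),
    StructuredRefineResponse("Florence Pugh", True),
    StructuredRefineResponse("I don't know the answer.", False),
]
print(refine_with_filtering(steps))  # Florence Pugh
```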
If you're opening this notebook on colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-llms-openai
!pip install llama-index
Load Data¶
texts = [
"The president in the year 2040 is John Cena.",
"The president in the year 2050 is Florence Pugh.",
'The president in the year 2060 is Dwayne "The Rock" Johnson.',
]
Summarize¶
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo-0613")
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
response_mode="refine", llm=llm, verbose=True
)
response = summarizer.get_response("who is president in the year 2050?", texts)
> Refine context: The president in the year 2050 is Florence Pugh.
> Refine context: The president in the year 2060 is Dwayne "The R...
Failed Result¶
As you can see, we were unable to get the correct answer from the input texts strings because the initial "I don't know" answer propagated through to the very end of the response synthesis.
print(response)
I'm sorry, but I don't have access to information about the future.
Now we'll try again with structured_answer_filtering=True.
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
response_mode="refine",
llm=llm,
verbose=True,
structured_answer_filtering=True,
)
response = summarizer.get_response("who is president in the year 2050?", texts)
Function call: StructuredRefineResponse with args: {
  "answer": "It is not possible to determine who the president is in the year 2050 based on the given context information.",
  "query_satisfied": false
}
> Refine context: The president in the year 2050 is Florence Pugh.
Function call: StructuredRefineResponse with args: {
  "answer": "Florence Pugh",
  "query_satisfied": true
}
> Refine context: The president in the year 2060 is Dwayne "The R...
Function call: StructuredRefineResponse with args: {
  "answer": "Florence Pugh",
  "query_satisfied": false
}
Successful Result¶
As you can see, we were able to determine the correct answer by filtering the texts strings down to the one that actually contains the answer to our question.
print(response)
Florence Pugh
# we'll stick with OpenAI, but use an older model that does not support function calling
instruct_llm = OpenAI(model="gpt-3.5-turbo-instruct")
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
response_mode="refine",
llm=instruct_llm,
verbose=True,
structured_answer_filtering=True,
)
response = summarizer.get_response("who is president in the year 2050?", texts)
print(response)
Florence Pugh
CompactAndRefine¶
Since CompactAndRefine is built on top of Refine, this response mode also supports structured answer filtering.
from llama_index.core import get_response_synthesizer
summarizer = get_response_synthesizer(
response_mode="compact",
llm=instruct_llm,
verbose=True,
structured_answer_filtering=True,
)
response = summarizer.get_response("who is president in the year 2050?", texts)
print(response)
Florence Pugh
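As an aside, the "compact" idea can be sketched as packing text chunks under a size budget so that refine runs over fewer, larger contexts. The sketch below is a toy illustration using a hypothetical character budget (max_chars); LlamaIndex's actual packing is token-aware and prompt-aware, not character-based:

```python
def compact(chunks, max_chars=200):
    """Pack consecutive text chunks into as few strings as possible,
    each at most max_chars characters (a simplified, character-based
    stand-in for token-aware context packing)."""
    packed, current = [], ""
    for chunk in chunks:
        # +1 accounts for the newline separator between packed chunks
        if current and len(current) + len(chunk) + 1 > max_chars:
            packed.append(current)
            current = chunk
        else:
            current = f"{current}\n{chunk}" if current else chunk
    if current:
        packed.append(current)
    return packed


texts = [
    "The president in the year 2040 is John Cena.",
    "The president in the year 2050 is Florence Pugh.",
    'The president in the year 2060 is Dwayne "The Rock" Johnson.',
]
# All three short chunks fit into one call's context under this budget.
print(len(compact(texts)))  # 1
```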