Understanding Your Metric's Prompts
Since Ragas treats prompts as hyperparameters of a metric, it provides a unified interface, get_prompts, for accessing the prompts used by any metric.
In [15]:
from ragas.metrics._simple_criteria import SimpleCriteriaScoreWithReference
scorer = SimpleCriteriaScoreWithReference(name="random", definition="some definition")
scorer.get_prompts()
Out[15]:
{'multi_turn_prompt': <ragas.metrics._simple_criteria.MultiTurnSimpleCriteriaWithReferencePrompt at 0x7f8c41410970>, 'single_turn_prompt': <ragas.metrics._simple_criteria.SingleTurnSimpleCriteriaWithReferencePrompt at 0x7f8c41412590>}
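Because the interface is uniform, the same call works on any other metric. A minimal sketch (Faithfulness is used here purely as an illustration):

from ragas.metrics import Faithfulness

# Every metric exposes its prompts through the same get_prompts() accessor
Faithfulness().get_prompts()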
In [2]:
prompts = scorer.get_prompts()
print(prompts["single_turn_prompt"].to_string())
Your task is to judge the faithfulness of a series of statements based on a given context. For each statement you must return verdict as 1 if the statement can be directly inferred based on the context or 0 if the statement can not be directly inferred based on the context.
Modifying Instructions in the Default Prompt
You will very likely need to modify a prompt to suit your own needs. Ragas provides a set_prompts method that allows you to do exactly that. Let's change one of the prompts used in the SimpleCriteriaScoreWithReference metric.
In [19]:
prompt = scorer.get_prompts()["single_turn_prompt"]
prompt.instruction += "\nOnly output valid JSON."
In [20]:
scorer.set_prompts(**{"single_turn_prompt": prompt})
Let's check whether the prompt's instruction has actually been changed.
In [21]:
print(scorer.get_prompts()["single_turn_prompt"].instruction)
Given a input, system response and reference. Evaluate and score the response against the reference only using the given criteria. Only output valid JSON.
Modifying Examples in the Default Prompt
Few-shot examples can greatly influence the results of any LLM, and the examples in the default prompt may well not reflect your domain or use case. It is therefore always good practice to replace them with custom examples. Let's do that here.
In [22]:
prompt = scorer.get_prompts()["single_turn_prompt"]
prompt.examples
Out[22]:
[(SingleTurnSimpleCriteriaWithReferenceInput(user_input='Who was the director of Los Alamos Laboratory?', response='Einstein was the director of Los Alamos Laboratory.', criteria='Score responses in range of 0 (low) to 5 (high) based similarity with reference.', reference='The director of Los Alamos Laboratory was J. Robert Oppenheimer.'), SimpleCriteriaOutput(reason='The response and reference have two very different answers.', score=0))]
In [23]:
from ragas.metrics._simple_criteria import (
SingleTurnSimpleCriteriaWithReferenceInput,
SimpleCriteriaOutput,
)
In [24]:
new_example = [
(
SingleTurnSimpleCriteriaWithReferenceInput(
user_input="Who was the first president of the United States?",
response="Thomas Jefferson was the first president of the United States.",
criteria="Score responses in range of 0 (low) to 5 (high) based similarity with reference.",
reference="George Washington was the first president of the United States.",
),
SimpleCriteriaOutput(
reason="The response incorrectly states Thomas Jefferson instead of George Washington. While both are significant historical figures, the answer does not match the reference.",
score=2,
),
)
]
In [25]:
prompt.examples = new_example
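Note that assigning to prompt.examples replaces the default examples entirely. If you would rather keep the defaults and add your own on top, concatenating the lists works too (a sketch):

# Alternative: keep the built-in examples and append the custom one
prompt.examples = prompt.examples + new_example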
In [26]:
scorer.set_prompts(**{"single_turn_prompt": prompt})
In [27]:
print(scorer.get_prompts()["single_turn_prompt"].examples)
[(SingleTurnSimpleCriteriaWithReferenceInput(user_input='Who was the first president of the United States?', response='Thomas Jefferson was the first president of the United States.', criteria='Score responses in range of 0 (low) to 5 (high) based similarity with reference.', reference='George Washington was the first president of the United States.'), SimpleCriteriaOutput(reason='The response incorrectly states Thomas Jefferson instead of George Washington. While both are significant historical figures, the answer does not match the reference.', score=2))]
Let's now view and verify the full new prompt, including the modified instruction and examples
In [ ]:
scorer.get_prompts()["single_turn_prompt"].to_string()
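With the customized prompt in place, the metric can be scored as usual. A minimal sketch (the evaluator LLM and model name below are assumptions; any LangChain-compatible chat model would do):

from langchain_openai import ChatOpenAI
from ragas import SingleTurnSample
from ragas.llms import LangchainLLMWrapper

# Attach an evaluator LLM to the metric (model choice is illustrative)
scorer.llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini"))

# Build a sample with the fields the metric expects
sample = SingleTurnSample(
    user_input="Who wrote 'Pride and Prejudice'?",
    response="Jane Austen wrote 'Pride and Prejudice'.",
    reference="'Pride and Prejudice' was written by Jane Austen.",
)

# In a notebook, top-level await works directly
score = await scorer.single_turn_ascore(sample)
print(score)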