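Testing prompts with Mocha/Chai
promptfoo can be integrated with test frameworks such as Mocha and assertion libraries such as Chai, so that prompts are evaluated as part of your existing testing and CI workflows.
This guide includes examples that show how to write Mocha test cases that check prompt quality using semantic similarity and LLM grading.
For more information on supported checks, see the Expected Outputs documentation.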
Prerequisites
Before you begin, make sure the following Node packages are installed:
- mocha:
npm install --save-dev mocha
- chai:
npm install --save-dev chai
- promptfoo:
npm install --save-dev promptfoo
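Creating custom chai assertions
First, we'll create custom chai assertions:
- toMatchSemanticSimilarity: compares two strings for semantic similarity.
- toPassLLMRubric: checks whether a string meets the specified LLM rubric criteria.
- toMatchFactuality: checks whether a string meets the specified factuality criteria.
- toMatchClosedQA: checks whether a string meets the specified question-answering criteria.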
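To make the role of the similarity threshold concrete, here is a minimal conceptual sketch of a threshold check over embedding vectors using cosine similarity. It is an illustration only and not promptfoo's implementation: the function names `cosineSimilarity` and `passesThreshold` are hypothetical, and the short vectors stand in for real embeddings, which would come from an embedding provider.

```javascript
// Conceptual sketch only. These helpers are hypothetical and are not part of promptfoo.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// The check passes when the similarity reaches the threshold.
// 0.8 mirrors the default threshold used by toMatchSemanticSimilarity in this guide.
function passesThreshold(a, b, threshold = 0.8) {
  return cosineSimilarity(a, b) >= threshold;
}
```

Raising the threshold makes the check stricter, which is why the tests later in this guide pass an explicit value when the default is not appropriate.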
Create a new file named assertions.js and add the following. The JavaScript version is shown first, followed by the TypeScript version.
// assertions.js (JavaScript version)
import { Assertion } from 'chai';
import { assertions } from 'promptfoo';

// All four matchers are used below, so all four must be destructured.
const { matchesSimilarity, matchesLlmRubric, matchesFactuality, matchesClosedQa } = assertions;

// Chai registers custom assertions with Assertion.addMethod. Each handler is async
// and therefore returns a promise, so callers must `await` the assertion.

Assertion.addMethod('toMatchSemanticSimilarity', async function (expected, threshold = 0.8) {
  const received = this._obj;
  const result = await matchesSimilarity(received, expected, threshold);
  // Identical strings always count as a match.
  const pass = received === expected || result.pass;
  this.assert(
    pass,
    `expected #{this} to match semantic similarity with #{exp}, but it did not. Reason: ${result.reason}`,
    `expected #{this} not to match semantic similarity with #{exp}`,
    expected,
  );
});

Assertion.addMethod('toPassLLMRubric', async function (expected, gradingConfig) {
  const received = this._obj;
  const gradingResult = await matchesLlmRubric(expected, received, gradingConfig);
  this.assert(
    gradingResult.pass,
    `expected #{this} to pass LLM Rubric with #{exp}, but it did not. Reason: ${gradingResult.reason}`,
    `expected #{this} not to pass LLM Rubric with #{exp}`,
    expected,
  );
});

Assertion.addMethod('toMatchFactuality', async function (input, expected, gradingConfig) {
  const received = this._obj;
  const gradingResult = await matchesFactuality(input, expected, received, gradingConfig);
  this.assert(
    gradingResult.pass,
    `expected #{this} to match factuality with #{exp}, but it did not. Reason: ${gradingResult.reason}`,
    `expected #{this} not to match factuality with #{exp}`,
    expected,
  );
});

Assertion.addMethod('toMatchClosedQA', async function (input, expected, gradingConfig) {
  const received = this._obj;
  const gradingResult = await matchesClosedQa(input, expected, received, gradingConfig);
  this.assert(
    gradingResult.pass,
    `expected #{this} to match ClosedQA with #{exp}, but it did not. Reason: ${gradingResult.reason}`,
    `expected #{this} not to match ClosedQA with #{exp}`,
    expected,
  );
});
// assertions.ts (TypeScript version)
import { Assertion } from 'chai';
import { assertions } from 'promptfoo';
import type { GradingConfig } from 'promptfoo';

const { matchesSimilarity, matchesLlmRubric } = assertions;

// Chai registers custom assertions with Assertion.addMethod. Each handler is async
// and therefore returns a promise, so callers must `await` the assertion.
// toMatchFactuality and toMatchClosedQA follow the same pattern as in the JavaScript version.

Assertion.addMethod(
  'toMatchSemanticSimilarity',
  async function (this: Assertion, expected: string, threshold: number = 0.8) {
    const received = this._obj;
    const result = await matchesSimilarity(received, expected, threshold);
    // Identical strings always count as a match.
    const pass = received === expected || result.pass;
    this.assert(
      pass,
      `expected #{this} to match semantic similarity with #{exp}, but it did not. Reason: ${result.reason}`,
      `expected #{this} not to match semantic similarity with #{exp}`,
      expected,
    );
  },
);

Assertion.addMethod(
  'toPassLLMRubric',
  async function (this: Assertion, expected: string, gradingConfig: GradingConfig) {
    const received = this._obj;
    const gradingResult = await matchesLlmRubric(expected, received, gradingConfig);
    this.assert(
      gradingResult.pass,
      `expected #{this} to pass LLM Rubric with #{exp}, but it did not. Reason: ${gradingResult.reason}`,
      `expected #{this} not to pass LLM Rubric with #{exp}`,
      expected,
    );
  },
);
Writing tests
Our test code uses the custom chai assertions to run a few test cases.
Create a new file named index.test.js and add the following code:
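import { expect } from 'chai';
import './assertions';

// Provider used to grade the LLM rubric checks.
const gradingConfig = {
  provider: 'openai:chat:gpt-4o-mini',
};

describe('semantic similarity tests', () => {
  it('should pass when strings are semantically similar', async () => {
    await expect('The quick brown fox').toMatchSemanticSimilarity('The fast brown fox');
  });

  it('should fail when strings are not semantically similar', async () => {
    await expect('The quick brown fox').not.toMatchSemanticSimilarity('The weather is nice today');
  });

  it('should pass when strings are semantically similar with a custom threshold', async () => {
    await expect('The quick brown fox').toMatchSemanticSimilarity('The fast brown fox', 0.7);
  });

  it('should fail when strings are not semantically similar with a custom threshold', async () => {
    await expect('The quick brown fox').not.toMatchSemanticSimilarity(
      'The weather is nice today',
      0.9,
    );
  });
});

describe('LLM evaluation tests', () => {
  it('should pass when the string meets the LLM rubric criteria', async () => {
    await expect('Four score and seven years ago').toPassLLMRubric(
      'Contains part of a famous speech',
      gradingConfig,
    );
  });

  it('should fail when the string does not meet the LLM rubric criteria', async () => {
    await expect('It is time to do laundry').not.toPassLLMRubric(
      'Contains part of a famous speech',
      gradingConfig,
    );
  });
});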
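Final setup
Add the following line to the scripts section of your package.json:
"test": "mocha"
Now you can run the tests with the following command:
npm test
This executes the tests and displays the results in the terminal.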
Note that if you are using the default providers, you need to set the OPENAI_API_KEY environment variable.
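If the key is missing, the failure otherwise surfaces only when a test first calls a provider. As a hedged sketch, a small helper can fail fast with a clearer message. The name `requireEnv` is hypothetical and not part of promptfoo or Mocha.

```javascript
// Hypothetical helper: returns the value of an environment variable,
// or throws a descriptive error when it is unset or empty.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

One way to use it would be to call requireEnv('OPENAI_API_KEY') inside a Mocha before hook, so the run stops with that message before any test reaches a provider.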