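Testing prompts with Jest and Vitest
promptfoo can be integrated with test frameworks such as Jest and Vitest to evaluate prompts as part of existing testing and CI workflows.
This guide includes examples that show how to create test cases for desired prompt quality using semantic similarity and LLM grading. You can also jump to the full example code.
For more information on supported checks, see the Expected Outputs documentation.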
Prerequisites
Before you begin, make sure you have the following node packages installed:
- jest:
npm install --save-dev jest
- vitest:
npm install --save-dev vitest
- promptfoo:
npm install --save-dev promptfoo
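Creating custom matchers
First, we'll create custom matchers:
- toMatchSemanticSimilarity: compares two strings for semantic similarity.
- toPassLLMRubric: checks whether a string meets the specified LLM rubric.
- toMatchFactuality: checks whether a string meets the specified factuality criteria.
- toMatchClosedQA: checks whether a string meets the specified closed-QA criteria.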
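To make the threshold argument of toMatchSemanticSimilarity more concrete, here is a minimal sketch of a threshold check on a similarity score. It assumes the score is a cosine similarity between embedding vectors; the vectors below are hand-written toy values rather than real embeddings, and cosineSimilarity and passesThreshold are hypothetical helpers, not part of promptfoo.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: passes when the similarity reaches the threshold.
// The default of 0.8 mirrors the default used by the matcher in this guide.
function passesThreshold(a, b, threshold = 0.8) {
  return cosineSimilarity(a, b) >= threshold;
}

// Toy vectors standing in for embeddings (assumed values for illustration).
const quickFox = [0.9, 0.1, 0.3];
const fastFox = [0.8, 0.2, 0.35];
const weather = [0.1, 0.9, 0.2];

console.log(passesThreshold(quickFox, fastFox)); // vectors point in similar directions
console.log(passesThreshold(quickFox, weather)); // vectors point in different directions
```

Raising the threshold makes the matcher stricter, and lowering it makes the matcher more tolerant of paraphrases.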
Create a new file named matchers.js and add the following. The JavaScript version is shown first, followed by the TypeScript version:
import { assertions } from 'promptfoo';

// All four grading helpers used below must be pulled from `assertions`.
const { matchesSimilarity, matchesLlmRubric, matchesFactuality, matchesClosedQa } = assertions;

export function installMatchers() {
  expect.extend({
    async toMatchSemanticSimilarity(received, expected, threshold = 0.8) {
      const result = await matchesSimilarity(received, expected, threshold);
      // Identical strings pass without relying on the similarity score.
      const pass = received === expected || result.pass;
      if (pass) {
        return {
          message: () => `expected ${received} not to match semantic similarity with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to match semantic similarity with ${expected}, but it did not. Reason: ${result.reason}`,
          pass: false,
        };
      }
    },

    async toPassLLMRubric(received, expected, gradingConfig) {
      // `expected` is the rubric text; `received` is the output being graded.
      const gradingResult = await matchesLlmRubric(expected, received, gradingConfig);
      if (gradingResult.pass) {
        return {
          message: () => `expected ${received} not to pass LLM Rubric with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to pass LLM Rubric with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
          pass: false,
        };
      }
    },

    // Jest passes the value given to expect() as the first argument,
    // so this matcher is called as expect(input).toMatchFactuality(expected, received, gradingConfig).
    async toMatchFactuality(input, expected, received, gradingConfig) {
      const gradingResult = await matchesFactuality(input, expected, received, gradingConfig);
      if (gradingResult.pass) {
        return {
          message: () => `expected ${received} not to match factuality with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to match factuality with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
          pass: false,
        };
      }
    },

    // Called as expect(input).toMatchClosedQA(expected, received, gradingConfig).
    async toMatchClosedQA(input, expected, received, gradingConfig) {
      const gradingResult = await matchesClosedQa(input, expected, received, gradingConfig);
      if (gradingResult.pass) {
        return {
          message: () => `expected ${received} not to match ClosedQA with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to match ClosedQA with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
          pass: false,
        };
      }
    },
  });
}
import { assertions } from 'promptfoo';
import type { GradingConfig } from 'promptfoo';

const { matchesSimilarity, matchesLlmRubric } = assertions;

// Make the custom matchers visible to the type checker.
declare global {
  namespace jest {
    interface Matchers<R> {
      toMatchSemanticSimilarity(expected: string, threshold?: number): R;
      toPassLLMRubric(expected: string, gradingConfig: GradingConfig): R;
    }
  }
}

export function installMatchers() {
  expect.extend({
    async toMatchSemanticSimilarity(
      received: string,
      expected: string,
      threshold: number = 0.8,
    ): Promise<jest.CustomMatcherResult> {
      const result = await matchesSimilarity(received, expected, threshold);
      // Identical strings pass without relying on the similarity score.
      const pass = received === expected || result.pass;
      if (pass) {
        return {
          message: () => `expected ${received} not to match semantic similarity with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to match semantic similarity with ${expected}, but it did not. Reason: ${result.reason}`,
          pass: false,
        };
      }
    },

    async toPassLLMRubric(
      received: string,
      expected: string,
      gradingConfig: GradingConfig,
    ): Promise<jest.CustomMatcherResult> {
      // `expected` is the rubric text; `received` is the output being graded.
      const gradingResult = await matchesLlmRubric(expected, received, gradingConfig);
      if (gradingResult.pass) {
        return {
          message: () => `expected ${received} not to pass LLM Rubric with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to pass LLM Rubric with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
          pass: false,
        };
      }
    },
  });
}
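Each matcher above repeats the same two-branch return. As a rough sketch of what that return value means to Jest, the snippet below builds the { pass, message } object from a grading result of the shape { pass, reason }. The toMatcherResult helper is hypothetical (it is not part of promptfoo or Jest), and the grading results are stubbed so the snippet runs without any API calls.

```javascript
// Hypothetical helper: converts a grading result ({ pass, reason }) into the
// { pass, message } object that a matcher registered via expect.extend returns.
function toMatcherResult(gradingResult, received, expected, label) {
  const pass = Boolean(gradingResult.pass);
  return {
    pass,
    // Jest only calls message() when the assertion fails. With pass === true
    // that happens under `.not`, which is why the passing branch says "not to".
    message: () =>
      pass
        ? `expected ${received} not to ${label} with ${expected}`
        : `expected ${received} to ${label} with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
  };
}

// Stubbed grading results standing in for real grader calls (assumed values).
const passing = toMatcherResult(
  { pass: true, reason: 'Quotes the Gettysburg Address' },
  'Four score and seven years ago',
  'Contains part of a famous speech',
  'pass LLM Rubric',
);
const failing = toMatcherResult(
  { pass: false, reason: 'No speech found' },
  'It is time to do laundry',
  'Contains part of a famous speech',
  'pass LLM Rubric',
);

console.log(passing.message());
console.log(failing.message());
```

Inside a matcher, the body could then shrink to a single call such as toMatcherResult(gradingResult, received, expected, 'pass LLM Rubric'), at the cost of one extra indirection.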
Writing tests
Our test code will use the custom matchers to run a few test cases.
Create a new file named index.test.js and add the following code:
import { installMatchers } from './matchers';

installMatchers();

// Provider used to grade outputs in the LLM rubric tests.
const gradingConfig = {
  provider: 'openai:chat:gpt-4o-mini',
};

describe('semantic similarity tests', () => {
  test('should pass when strings are semantically similar', async () => {
    await expect('The quick brown fox').toMatchSemanticSimilarity('A fast brown fox');
  });

  test('should fail when strings are not semantically similar', async () => {
    await expect('The quick brown fox').not.toMatchSemanticSimilarity('The weather is nice today');
  });

  test('should pass when strings are semantically similar with custom threshold', async () => {
    await expect('The quick brown fox').toMatchSemanticSimilarity('A fast brown fox', 0.7);
  });

  test('should fail when strings are not semantically similar with custom threshold', async () => {
    await expect('The quick brown fox').not.toMatchSemanticSimilarity(
      'The weather is nice today',
      0.9,
    );
  });
});

describe('LLM evaluation tests', () => {
  test('should pass when strings meet the LLM Rubric criteria', async () => {
    await expect('Four score and seven years ago').toPassLLMRubric(
      'Contains part of a famous speech',
      gradingConfig,
    );
  });

  test('should fail when strings do not meet the LLM Rubric criteria', async () => {
    await expect('It is time to do laundry').not.toPassLLMRubric(
      'Contains part of a famous speech',
      gradingConfig,
    );
  });
});
Final setup
Add the following line to the scripts section of your package.json:
"test": "jest"
Now you can run the tests with the following command:
npm test
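This will execute the tests and display the results in the terminal.
Note that if you are using the default providers, you will need to set the OPENAI_API_KEY environment variable.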
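Because a missing key tends to surface as a confusing provider error in the middle of a test run, it can help to check for it up front. The snippet below is a small sketch of such a check; missingEnvVars is a hypothetical helper, not something promptfoo or Jest provides.

```javascript
// Hypothetical helper: returns the names of the required environment
// variables that are unset or empty in the given environment object.
function missingEnvVars(names, env = process.env) {
  return names.filter((name) => !env[name]);
}

// Warn early so the cause is obvious before any LLM-backed test runs.
const missing = missingEnvVars(['OPENAI_API_KEY']);
if (missing.length > 0) {
  console.warn(
    `Missing environment variables: ${missing.join(', ')}. LLM-backed tests are likely to fail.`,
  );
}
```

A check like this could sit at the top of index.test.js, before installMatchers() is called.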