Testing prompts with Jest and Vitest

promptfoo can be integrated with test frameworks such as Jest and Vitest, so you can evaluate prompts as part of your existing test and CI workflows.

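This guide includes examples showing how to write test cases for prompt quality using semantic similarity and LLM grading. You can also skip ahead to the full example code.

For more information on supported checks, see the expected outputs documentation.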

Prerequisites

Before you begin, make sure the following Node packages are installed:

  • jest: npm install --save-dev jest
  • vitest: npm install --save-dev vitest
  • promptfoo: npm install --save-dev promptfoo

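Creating custom matchers

First, we'll create custom matchers:

  • toMatchSemanticSimilarity: Compares two strings for semantic similarity.
  • toPassLLMRubric: Checks whether a string satisfies the specified LLM rubric.
  • toMatchFactuality: Checks whether a string satisfies the specified factuality criteria.
  • toMatchClosedQA: Checks whether a string satisfies the specified question-answering criteria.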
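To make the first matcher more concrete: semantic similarity is commonly scored as the cosine similarity between two embedding vectors and then compared against a threshold. The sketch below illustrates that idea with hand-written vectors. The helper names `cosineSimilarity` and `checkSimilarity` are hypothetical and are not part of promptfoo; in practice the embeddings come from a provider, and promptfoo's internals may differ from this.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper that returns a { pass, reason } style result,
// similar in shape to what the matchers in this guide consume.
function checkSimilarity(vecA, vecB, threshold = 0.8) {
  const score = cosineSimilarity(vecA, vecB);
  const pass = score >= threshold;
  return {
    pass,
    score,
    reason: pass
      ? `Similarity ${score.toFixed(2)} meets threshold ${threshold}`
      : `Similarity ${score.toFixed(2)} is below threshold ${threshold}`,
  };
}
```

Raising the threshold makes the check stricter, which is what the optional third argument to toMatchSemanticSimilarity controls.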

Create a new file named matchers.js and add the following:

import { assertions } from 'promptfoo';

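// All four graders are used by the matchers below, so destructure each of them.
const { matchesSimilarity, matchesLlmRubric, matchesFactuality, matchesClosedQa } = assertions;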

export function installMatchers() {
  expect.extend({
    // Passes when the strings are identical or the similarity check meets the threshold.
    async toMatchSemanticSimilarity(received, expected, threshold = 0.8) {
      const result = await matchesSimilarity(received, expected, threshold);
      const pass = received === expected || result.pass;
      if (pass) {
        return {
          message: () => `expected ${received} not to match semantic similarity with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to match semantic similarity with ${expected}, but it did not. Reason: ${result.reason}`,
          pass: false,
        };
      }
    },

    // Here `expected` is the rubric text and `received` is the output being graded.
    async toPassLLMRubric(received, expected, gradingConfig) {
      const gradingResult = await matchesLlmRubric(expected, received, gradingConfig);
      if (gradingResult.pass) {
        return {
          message: () => `expected ${received} not to pass LLM Rubric with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to pass LLM Rubric with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
          pass: false,
        };
      }
    },

    // The value given to expect() arrives as the first parameter (`input`), so the
    // call shape is expect(input).toMatchFactuality(expected, output, gradingConfig).
    async toMatchFactuality(input, expected, received, gradingConfig) {
      const gradingResult = await matchesFactuality(input, expected, received, gradingConfig);
      if (gradingResult.pass) {
        return {
          message: () => `expected ${received} not to match factuality with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to match factuality with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
          pass: false,
        };
      }
    },

    // Same argument order as toMatchFactuality: expect(input).toMatchClosedQA(expected, output, gradingConfig).
    async toMatchClosedQA(input, expected, received, gradingConfig) {
      const gradingResult = await matchesClosedQa(input, expected, received, gradingConfig);
      if (gradingResult.pass) {
        return {
          message: () => `expected ${received} not to match ClosedQA with ${expected}`,
          pass: true,
        };
      } else {
        return {
          message: () =>
            `expected ${received} to match ClosedQA with ${expected}, but it did not. Reason: ${gradingResult.reason}`,
          pass: false,
        };
      }
    },
  });
}
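The four matchers above share one shape: await a grader, then return an object with `pass` and a `message` function. As a rough sketch of that contract, the hypothetical `makeMatcher` factory below wraps any async grader that resolves to `{ pass, reason }`. The `stubGrader` function is a stand-in for a promptfoo call so the example runs offline; neither name comes from promptfoo or Jest.

```javascript
// Hypothetical factory: turns an async grader returning { pass, reason }
// into a function with the return shape that expect.extend matchers use.
function makeMatcher(label, grade) {
  return async function matcher(received, ...args) {
    const result = await grade(received, ...args);
    return {
      pass: result.pass,
      message: () =>
        result.pass
          ? `expected ${received} not to ${label}`
          : `expected ${received} to ${label}, but it did not. Reason: ${result.reason}`,
    };
  };
}

// Stub grader (an assumption for illustration): a case-insensitive substring
// check standing in for a network-backed grading call.
async function stubGrader(received, expected) {
  const pass = received.toLowerCase().includes(expected.toLowerCase());
  return { pass, reason: pass ? 'substring found' : 'substring not found' };
}

const toContainIgnoringCase = makeMatcher('contain the text (ignoring case)', stubGrader);
```

A matcher built this way could be registered with expect.extend({ toContainIgnoringCase }). Whether the reduced repetition is worth the indirection is a judgment call; the explicit versions are easier to read one at a time.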

Writing tests

Our test code will use the custom matchers to run a few test cases.

Create a new file named index.test.js and add the following code:

import { installMatchers } from './matchers';

// Register the custom matchers before any test runs.
installMatchers();

// Grader used by the LLM rubric tests.
const gradingConfig = {
  provider: 'openai:chat:gpt-4o-mini',
};

describe('semantic similarity tests', () => {
  test('should pass when strings are semantically similar', async () => {
    await expect('The quick brown fox').toMatchSemanticSimilarity('A fast brown fox');
  });

  test('should fail when strings are not semantically similar', async () => {
    await expect('The quick brown fox').not.toMatchSemanticSimilarity('The weather is nice today');
  });

  test('should pass when strings are semantically similar with custom threshold', async () => {
    await expect('The quick brown fox').toMatchSemanticSimilarity('A fast brown fox', 0.7);
  });

  test('should fail when strings are not semantically similar with custom threshold', async () => {
    await expect('The quick brown fox').not.toMatchSemanticSimilarity(
      'The weather is nice today',
      0.9,
    );
  });
});

describe('LLM evaluation tests', () => {
  test('should pass when strings meet the LLM Rubric criteria', async () => {
    await expect('Four score and seven years ago').toPassLLMRubric(
      'Contains part of a famous speech',
      gradingConfig,
    );
  });

  test('should fail when strings do not meet the LLM Rubric criteria', async () => {
    await expect('It is time to do laundry').not.toPassLLMRubric(
      'Contains part of a famous speech',
      gradingConfig,
    );
  });
});

Final setup

Add the following line to the scripts section of your package.json:

"test": "jest"

Now you can run the tests with the following command:

npm test

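This executes the tests and displays the results in the terminal.

Note that if you are using the default providers, you need to set the OPENAI_API_KEY environment variable.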
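When the key is absent, for example on a contributor's machine or a CI job without secrets, the LLM-backed tests will presumably fail for reasons unrelated to prompt quality. One possible guard is sketched below. The `missingEnvVars` helper is hypothetical and not part of promptfoo, Jest, or Vitest.

```javascript
// Hypothetical helper: returns the names of required environment
// variables that are unset or blank in the given environment object.
function missingEnvVars(names, env = process.env) {
  return names.filter((name) => !env[name] || env[name].trim() === '');
}

// One way to use it in a test file (describe.skip exists in both Jest and Vitest):
//
//   const describeLLM =
//     missingEnvVars(['OPENAI_API_KEY']).length > 0 ? describe.skip : describe;
//
//   describeLLM('LLM evaluation tests', () => {
//     /* tests that call the grading provider */
//   });
```

Skipping keeps local runs green but can hide a misconfigured CI environment, so failing fast with a clear error may be preferable there.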