Using the Node package

Installation

promptfoo is available as a Node package on npm:

npm install promptfoo

Usage

Use promptfoo as a library in your project by importing the evaluate function:

import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate(testSuite, options);

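The evaluate function takes the following parameters:

testSuite: the JavaScript object equivalent of promptfooconfig.yaml, as a TestSuiteConfiguration object.
options: miscellaneous options that control how the test harness runs, as an EvaluateOptions object.

The results of the evaluation are returned as an EvaluateSummary object.

Provider functions

A ProviderFunction is a JavaScript function that implements an LLM API call. It takes a prompt string and a context, and it returns the LLM response or an error. See the ProviderFunction type.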
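To make the shape of a provider function concrete, here is a minimal sketch. The ProviderContext and ProviderResult interfaces are simplified local stand-ins written for this illustration, not promptfoo's own type definitions, and echoProvider is a hypothetical name. Returning an object with an error field on failure is an assumption based on the description above; check the ProviderFunction type for the exact contract.

```typescript
// Simplified local stand-ins (assumptions for this sketch, not promptfoo's real types).
interface ProviderContext {
  vars: Record<string, string>;
}

interface ProviderResult {
  output?: string;
  error?: string;
}

// Hypothetical provider: echoes the prompt instead of calling a real LLM API.
function echoProvider(prompt: string, context: ProviderContext): ProviderResult {
  // Report an error instead of an output when there is nothing to send.
  if (prompt.trim() === '') {
    return { error: 'Prompt was empty' };
  }
  // List the variable names from the context so the echo shows what was passed in.
  const varNames = Object.keys(context.vars).join(',');
  return { output: `ECHO(${varNames}): ${prompt}` };
}
```

A function with this shape could be listed under providers next to a string such as 'openai:gpt-4o-mini', in the same way as the inline provider in the example further down.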

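Assertion functions

An Assertion can take an AssertionFunction as its value. An AssertionFunction receives the following parameters:

output: the LLM output
testCase: the test case
assertion: the assertion object

Type definition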

type AssertionFunction = (
  output: string,
  testCase: AtomicTestCase,
  assertion: Assertion,
) => Promise<GradingResult>;

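interface GradingResult {
  // Whether the test passed or failed
  pass: boolean;

  // Test score, typically between 0 and 1
  score: number;

  // Plain-text reason for the result
  reason: string;

  // Map of labeled metrics to values
  namedScores?: Record<string, number>;

  // Record of token usage for this assertion
  tokensUsed?: Partial<{
    total: number;
    prompt: number;
    completion: number;
    cached?: number;
  }>;

  // List of results for each component of the assertion
  componentResults?: GradingResult[];

  // The assertion that was evaluated
  assertion: Assertion | null;
}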

For more information on the different assertion types, see Assertions & metrics.
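As an illustration of what an assertion function might return, the sketch below grades an output by substring match. SimpleGradingResult is a reduced local stand-in that carries only the pass, score, and reason fields, and gradeContains and containsJaiFaim are hypothetical names chosen for this example. The wrapper ignores the testCase and assertion arguments because this particular check does not need them.

```typescript
// Reduced local stand-in for GradingResult (an assumption for this sketch;
// only the three scoring fields are modeled).
interface SimpleGradingResult {
  pass: boolean;
  score: number;
  reason: string;
}

// Hypothetical helper: pass when `output` contains `expected`.
function gradeContains(output: string, expected: string): SimpleGradingResult {
  const pass = output.includes(expected);
  return {
    pass,
    score: pass ? 1 : 0,
    reason: pass
      ? `Output contained "${expected}"`
      : `Output did not contain "${expected}"`,
  };
}

// Promise-returning wrapper, closer in shape to AssertionFunction.
const containsJaiFaim = async (output: string): Promise<SimpleGradingResult> =>
  gradeContains(output, "J'ai faim");
```

Keeping the grading logic in a plain synchronous helper makes it easy to unit test on its own, while the thin async wrapper is what would be supplied as the value of an assertion.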

Example

promptfoo exports an evaluate function that you can use to run prompt evaluations.

import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate(
  {
    prompts: ['Rephrase this in French: {{body}}', 'Rephrase this like a pirate: {{body}}'],
    providers: ['openai:gpt-4o-mini'],
    tests: [
      {
        vars: {
          body: 'Hello world',
        },
      },
      {
        vars: {
          body: "I'm hungry",
        },
      },
    ],
    writeLatestResults: true, // write results to disk so they can be viewed in the web viewer
  },
  {
    maxConcurrency: 2,
  },
);

console.log(results);

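This code imports the promptfoo library, defines the test suite and the evaluation options, and then calls the evaluate function with them.

You can also supply functions as prompts, providers, or asserts: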

import promptfoo from 'promptfoo';

(async () => {
  const results = await promptfoo.evaluate({
    prompts: [
      'Rephrase this in French: {{body}}',
      (vars) => {
        return `Rephrase this like a pirate: ${vars.body}`;
      },
    ],
    providers: [
      'openai:gpt-4o-mini',
      (prompt, context) => {
        // Call the LLM here...
        console.log(`Prompt: ${prompt}, vars: ${JSON.stringify(context.vars)}`);
        return {
          output: '<LLM output>',
        };
      },
    ],
    tests: [
      {
        vars: {
          body: 'Hello world',
        },
      },
      {
        vars: {
          body: "I'm hungry",
        },
        assert: [
          {
            type: 'javascript',
            value: (output) => {
              const pass = output.includes("J'ai faim");
              return {
                pass,
                score: pass ? 1.0 : 0.0,
                reason: pass ? 'Output contained substring' : 'Output did not contain substring',
              };
            },
          },
        ],
      },
    ],
  });
  console.log('RESULTS:');
  console.log(results);
})();

A complete example is available on GitHub.

Here is the example output in JSON format:

{
  "results": [
    {
      "prompt": {
        "raw": "Rephrase this in French: Hello world",
        "display": "Rephrase this in French: {{body}}"
      },
      "vars": {
        "body": "Hello world"
      },
      "response": {
        "output": "Bonjour le monde",
        "tokenUsage": {
          "total": 19,
          "prompt": 16,
          "completion": 3
        }
      }
    },
    {
      "prompt": {
        "raw": "Rephrase this in French: I&#39;m hungry",
        "display": "Rephrase this in French: {{body}}"
      },
      "vars": {
        "body": "I'm hungry"
      },
      "response": {
        "output": "J'ai faim.",
        "tokenUsage": {
          "total": 24,
          "prompt": 19,
          "completion": 5
        }
      }
    }
    // ...
  ],
  "stats": {
    "successes": 4,
    "failures": 0,
    "tokenUsage": {
      "total": 120,
      "prompt": 72,
      "completion": 48
    }
  },
  "table": [
    ["Rephrase this in French: {{body}}", "Rephrase this like a pirate: {{body}}", "body"],
    ["Bonjour le monde", "Ahoy thar, me hearties! Avast ye, world!", "Hello world"],
    [
      "J'ai faim.",
      "Arrr, me belly be empty and me throat be parched! I be needin' some grub, matey!",
      "I'm hungry"
    ]
  ]
}
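One way to work with a summary of this shape is sketched below. It assumes only the stats and table fields exactly as they appear in the sample output above; the real EvaluateSummary type may contain more fields or differ between promptfoo versions. The StatsLike interface and the passRate and tableToRecords helpers are hypothetical names introduced for this illustration.

```typescript
// Local stand-in modeled on the "stats" object in the sample output (an assumption).
interface StatsLike {
  successes: number;
  failures: number;
}

// Fraction of test results that succeeded; 0 when nothing was run.
function passRate(stats: StatsLike): number {
  const total = stats.successes + stats.failures;
  return total === 0 ? 0 : stats.successes / total;
}

// Turn the header-plus-rows "table" array into one record per row,
// keyed by the column names in the first row.
function tableToRecords(table: string[][]): Record<string, string>[] {
  const [header, ...rows] = table;
  if (header === undefined) {
    return [];
  }
  return rows.map((row) => {
    const record: Record<string, string> = {};
    header.forEach((name, i) => {
      record[name] = row[i] ?? '';
    });
    return record;
  });
}
```

With the sample output, passRate(results.stats) would be computed from 4 successes and 0 failures, and each record from tableToRecords(results.table) would map a prompt template or variable name to its cell value.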

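Artificial intelligence and machine learning

Overview

Artificial intelligence (AI) and machine learning (ML) are among the most widely discussed topics in technology today. AI refers to computer systems that can perform tasks that normally require human intelligence, such as visual recognition, speech recognition, and decision-making. ML is a subset of AI that focuses on developing algorithms that learn from data.

Key concepts

Supervised learning

Supervised learning is an ML technique in which an algorithm learns from labeled training data. The goal is a model that makes accurate predictions on new data.

Unsupervised learning

Unsupervised learning discovers hidden patterns or groupings in unlabeled data. Clustering and dimensionality reduction are common applications.

Reinforcement learning

Reinforcement learning is an ML approach in which an agent learns the best behavior by interacting with an environment. Rewards and penalties guide the learning process.

Applications

Healthcare

AI and ML are widely applied in healthcare, including disease diagnosis, drug discovery, and personalized treatment.

Finance

In finance, AI is used for fraud detection, algorithmic trading, and risk management.

Autonomous driving

Self-driving cars rely on AI and ML to perceive their surroundings, make decisions, and control the vehicle.

Challenges

Data privacy

Ensuring data privacy and security is a major challenge when handling sensitive data.

Ethical concerns

The lack of transparency in how AI systems reach decisions raises ethical concerns about bias and fairness.

Computational resources

Training complex ML models requires substantial computational resources, which can limit their wider adoption.

Future outlook

The outlook for AI and ML is promising, and they are expected to bring major changes across industries. Continued research and technical progress should address the current challenges and drive these technologies forward.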
