accuracy_score: Computing standard, balanced, and per-class accuracy

A function for computing basic classification accuracy, per-class accuracy, and average per-class accuracy.

> from mlxtend.evaluate import accuracy_score

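Example 1 -- Standard Accuracy

The "overall" accuracy is defined as the fraction of correct predictions (true positives TP and true negatives TN) over all samples n: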

$$ACC = \frac{TP + TN}{n}$$

import numpy as np
from mlxtend.evaluate import accuracy_score


y_targ = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 2]

accuracy_score(y_targ, y_pred)

0.5555555555555556
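
As a cross-check, the value above can be reproduced by counting matching labels by hand. The following is a minimal pure-Python sketch, not mlxtend's implementation; the helper name `manual_accuracy` is made up for illustration.

```python
# Hypothetical helper for illustration; not part of mlxtend.
def manual_accuracy(y_true, y_pred):
    """Return the fraction of positions where the prediction equals the target."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Each comparison yields True/False, which sum() counts as 1/0.
    n_correct = sum(t == p for t, p in zip(y_true, y_pred))
    return n_correct / len(y_true)


y_targ = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 2]

acc = manual_accuracy(y_targ, y_pred)
print(acc)  # 5 of the 9 labels match
```

Five of the nine predictions agree with their targets, which gives the same 5/9 reported by `accuracy_score`.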

Example 2 -- Per-Class Accuracy

The per-class accuracy is the accuracy of one class (defined as the pos_label) versus all remaining datapoints in the dataset.

import numpy as np
from mlxtend.evaluate import accuracy_score


y_targ = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 2]

std_acc = accuracy_score(y_targ, y_pred)
bin_acc = accuracy_score(y_targ, y_pred, method='binary', pos_label=1)

print(f'Standard accuracy: {std_acc*100:.2f}%')
print(f'Class 1 accuracy: {bin_acc*100:.2f}%')

Standard accuracy: 55.56%
Class 1 accuracy: 66.67%
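
The one-vs-rest idea behind the per-class accuracy can be sketched without the library: both label lists are binarized against the positive class, and the binarized lists are then compared. This is an illustrative sketch only; `one_vs_rest_accuracy` is a hypothetical name and not an mlxtend function.

```python
# Hypothetical helper for illustration; not part of mlxtend.
def one_vs_rest_accuracy(y_true, y_pred, pos_label):
    """Binarize both label lists against pos_label, then compare them."""
    true_bin = [label == pos_label for label in y_true]
    pred_bin = [label == pos_label for label in y_pred]
    # A sample counts as correct if both lists agree on "is pos_label or not".
    n_correct = sum(t == p for t, p in zip(true_bin, pred_bin))
    return n_correct / len(true_bin)


y_targ = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 2]

bin_acc_manual = one_vs_rest_accuracy(y_targ, y_pred, pos_label=1)
print(f'Class 1 accuracy: {bin_acc_manual*100:.2f}%')
```

Note that a sample with target 2 and prediction 0 counts as correct for class 1, because both labels fall into the "not class 1" group.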

Example 3 -- Average Per-Class Accuracy

Overview

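In a binary class setting, the "overall" accuracy is defined as the number of correct predictions (true positives TP and true negatives TN) over all samples n:

$$ACC = \frac{TP + TN}{n}$$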

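In a multi-class setting, we can generalize the computation of accuracy as the fraction of all true predictions T (the diagonal of the confusion matrix) over all samples n.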

$$ACC = \frac{T}{n}$$

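Consider a multi-class problem with 3 classes (C0, C1, C2), and assume our model made predictions on 90 samples such that the diagonal of the confusion matrix holds 3, 50, and 18 correct predictions. We compute the accuracy as: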

$$ACC = \frac{3 + 50 + 18}{90} \approx 0.79$$
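
The diagonal-over-total computation can be sketched in plain Python. Because the full confusion matrix of the 90-sample example is not reproduced in this text, the sketch builds a matrix from a small, made-up pair of label lists and only re-checks the arithmetic of the worked example. The helper name `confusion_matrix` is an assumption for illustration and is not the mlxtend or scikit-learn function of the same name.

```python
# Hypothetical helper for illustration; rows are true classes, columns are predictions.
def confusion_matrix(y_true, y_pred, labels):
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


y_targ = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 2]
labels = [0, 1, 2]

cm = confusion_matrix(y_targ, y_pred, labels)
T = sum(cm[i][i] for i in range(len(labels)))  # correct predictions (diagonal)
n = sum(sum(row) for row in cm)                # all samples
print(cm, T / n)

# Arithmetic of the worked example: diagonal 3, 50, 18 out of 90 samples.
worked_acc = (3 + 50 + 18) / 90
print(round(worked_acc, 2))
```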

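Now, in order to compute the average per-class accuracy, we compute the binary accuracy for each class label separately; i.e., if class 1 is the positive class, classes 0 and 2 are both considered the negative class.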

$$APC\;ACC = \frac{83/90 + 71/90 + 78/90}{3} \approx 0.86$$
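
The averaging step can be verified with exact fractions. The three per-class values below are copied from the formula above; nothing else about the underlying confusion matrix is assumed.

```python
from fractions import Fraction

# Per-class binary accuracies from the worked example (n = 90 samples).
per_class = [Fraction(83, 90), Fraction(71, 90), Fraction(78, 90)]

# Unweighted mean over the three classes.
apc_acc = sum(per_class) / len(per_class)
print(apc_acc, round(float(apc_acc), 2))
```

The average per-class accuracy can never be lower than the standard accuracy on the same predictions: every sample that is classified correctly is also correct in each of the binary one-vs-rest problems, and the binary problems additionally credit true negatives.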

import numpy as np
from mlxtend.evaluate import accuracy_score


y_targ = [0, 0, 0, 1, 1, 1, 2, 0, 0]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 1]

std_acc = accuracy_score(y_targ, y_pred)
bin_acc = accuracy_score(y_targ, y_pred, method='binary', pos_label=1)
avg_acc = accuracy_score(y_targ, y_pred, method='average')

print(f'Standard accuracy: {std_acc*100:.2f}%')
print(f'Class 1 accuracy: {bin_acc*100:.2f}%')
print(f'Average per-class accuracy: {avg_acc*100:.2f}%')

Standard accuracy: 33.33%
Class 1 accuracy: 55.56%
Average per-class accuracy: 55.56%
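
The 55.56% average can be reproduced by hand by running the one-vs-rest comparison once per class and averaging the results. This is a sketch under the assumption that the set of classes is taken from the true labels; `average_per_class_accuracy` is a hypothetical helper, not mlxtend code.

```python
# Hypothetical helper for illustration; not part of mlxtend.
def average_per_class_accuracy(y_true, y_pred):
    labels = sorted(set(y_true))  # assumption: classes come from the true labels
    scores = []
    for label in labels:
        true_bin = [t == label for t in y_true]
        pred_bin = [p == label for p in y_pred]
        n_correct = sum(a == b for a, b in zip(true_bin, pred_bin))
        scores.append(n_correct / len(y_true))
    # Unweighted mean of the per-class binary accuracies.
    return sum(scores) / len(scores), scores


y_targ = [0, 0, 0, 1, 1, 1, 2, 0, 0]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 1]

avg_manual, per_class = average_per_class_accuracy(y_targ, y_pred)
print(per_class)
print(f'Average per-class accuracy: {avg_manual*100:.2f}%')
```

For these labels the per-class binary accuracies are 4/9, 5/9, and 6/9 for classes 0, 1, and 2, so their mean is 5/9.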

API

accuracy_score(y_target, y_predicted, method='standard', pos_label=1, normalize=True)

General accuracy function for supervised learning.

Parameters

y_target : array-like, shape=[n_values]

True class labels or target values.
y_predicted : array-like, shape=[n_values]

Predicted class labels or target values.
method : str, 'standard' by default.

The chosen method for accuracy computation. If set to 'standard', computes overall accuracy. If set to 'binary', computes accuracy for class pos_label. If set to 'average', computes average per-class (balanced) accuracy. If set to 'balanced', computes the scikit-learn-style balanced accuracy.
pos_label : str or int, 1 by default.

The class whose accuracy score is to be reported. Used only when method is set to 'binary'.
normalize : bool, True by default.

If True, returns fraction of correctly classified samples. If False, returns number of correctly classified samples.
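
To make the difference between the 'average' and 'balanced' options more concrete: scikit-learn defines balanced accuracy as the unweighted mean of the per-class recall. The sketch below implements that definition in plain Python. Whether `method='balanced'` in mlxtend returns exactly this value has not been verified here and should be treated as an assumption; `macro_recall` is a hypothetical helper name.

```python
# Hypothetical helper illustrating the scikit-learn definition of balanced accuracy.
def macro_recall(y_true, y_pred):
    labels = sorted(set(y_true))
    recalls = []
    for label in labels:
        # Predictions made for the samples whose true class is `label`.
        preds_for_class = [p for t, p in zip(y_true, y_pred) if t == label]
        n_hit = sum(p == label for p in preds_for_class)
        recalls.append(n_hit / len(preds_for_class))
    return sum(recalls) / len(recalls), recalls


y_targ = [0, 0, 0, 1, 1, 1, 2, 0, 0]
y_pred = [1, 0, 0, 0, 1, 2, 0, 2, 1]

balanced, recalls = macro_recall(y_targ, y_pred)
print(recalls)
print(f'Mean per-class recall: {balanced*100:.2f}%')
```

Unlike the one-vs-rest averaging, the recall-based score gives no credit for true negatives, so on the same labels it can be considerably lower than the average per-class accuracy.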

Returns

score: float

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/evaluate/accuracy_score/