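BootstrapOutOfBag: A scikit-learn compatible version of the out-of-bag bootstrap
An implementation of the out-of-bag bootstrap for evaluating supervised learning algorithms.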
> `from mlxtend.evaluate import BootstrapOutOfBag`
Overview
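Originally, the bootstrap method was designed to determine the statistical properties of an estimator when the underlying distribution is unknown and no additional samples are available. To use this method for evaluating predictive models, such as hypotheses for classification and regression, we may prefer a slightly different approach to bootstrapping: the so-called Out-Of-Bag (OOB) or Leave-One-Out Bootstrap (LOOB) technique. Here, we use the out-of-bag samples as test sets for evaluation instead of evaluating the model on the training data. The out-of-bag samples are the unique instances that were not drawn into the bootstrap sample and therefore were not used for model fitting, as illustrated in the figure from [1].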
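The figure illustrates what three random bootstrap samples drawn from an exemplary ten-sample dataset ($X_1, X_2, ..., X_{10}$) and their out-of-bag samples for testing may look like. In practice, Bradley Efron and Robert Tibshirani recommend drawing 50 to 200 bootstrap samples as being sufficient for reliable estimates [2].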
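The sampling step described above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not mlxtend's implementation; the function name `bootstrap_oob_split` and the seed are assumptions made for this example.

```python
import numpy as np


def bootstrap_oob_split(n_samples, rng):
    """Draw one bootstrap sample and return (train_idx, oob_idx).

    Hypothetical helper for illustration only.
    """
    all_idx = np.arange(n_samples)
    # Bootstrap sample: n_samples indices drawn with replacement,
    # so some indices repeat and others are never drawn.
    train_idx = rng.choice(all_idx, size=n_samples, replace=True)
    # Out-of-bag set: the unique indices that were never drawn.
    oob_idx = np.setdiff1d(all_idx, train_idx)
    return train_idx, oob_idx


rng = np.random.default_rng(123)  # arbitrary seed, for reproducibility
for _ in range(3):  # three bootstrap rounds over a ten-sample dataset
    train_idx, oob_idx = bootstrap_oob_split(10, rng)
    print(train_idx, oob_idx)
```

Each round yields a training index array of the same size as the dataset plus a usually smaller, disjoint out-of-bag array that serves as the test set.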
References
- [1] https://sebastianraschka.com/blog/2016/model-evaluation-selection-part2.html
- [2] Efron, Bradley, and Robert J. Tibshirani. An Introduction to the Bootstrap. CRC Press, 1994.
Example 1 -- Evaluating the predictive performance of a model
The `BootstrapOutOfBag` class mimics the behavior of scikit-learn's cross-validation classes, e.g., `KFold`:
```python
from mlxtend.evaluate import BootstrapOutOfBag
import numpy as np

# No random_seed is set here, so the indices differ from run to run.
oob = BootstrapOutOfBag(n_splits=3)
for train, test in oob.split(np.array([1, 2, 3, 4, 5])):
    print(train, test)
```

```
[4 2 1 3 3] [0]
[2 4 1 2 1] [0 3]
[4 3 3 4 1] [0 2]
```
Consequently, we can pass a `BootstrapOutOfBag` object to scikit-learn's `cross_val_score` function:
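```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

iris = load_iris()
X = iris.data
y = iris.target
lr = LogisticRegression()

# Default cv setting. The three scores below come from 3-fold
# cross-validation, the default before scikit-learn 0.22.
print(cross_val_score(lr, X, y))
```

```
[ 0.96078431 0.92156863 0.95833333]
```

```python
# Same estimator, evaluated on three out-of-bag test sets instead.
print(cross_val_score(lr, X, y, cv=BootstrapOutOfBag(n_splits=3, random_seed=456)))
```

```
[ 0.92727273 0.96226415 0.94444444]
```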
In practice, it is recommended to run at least 200 iterations:
```python
# Average the out-of-bag accuracy over 200 bootstrap rounds.
print('Mean accuracy: %.1f%%' % np.mean(100*cross_val_score(
    lr, X, y, cv=BootstrapOutOfBag(n_splits=200, random_seed=456))))
```

```
Mean accuracy: 94.8%
```
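Using the bootstrap, we can apply the percentile method to compute confidence bounds for the performance estimate. We pick the lower and upper confidence bounds as follows:
- $ACC_{lower}$ = $(100 \times \alpha_1)$th percentile of the $ACC_{boot}$ distribution
- $ACC_{upper}$ = $(100 \times \alpha_2)$th percentile of the $ACC_{boot}$ distribution
where $\alpha_1 = \alpha$ and $\alpha_2 = 1-\alpha$, and $\alpha$ sets the degree of confidence of the resulting $100 \times (1-2 \times \alpha)$% confidence interval. For instance, to compute a 95% confidence interval, we pick $\alpha=0.025$ and take the 2.5th and 97.5th percentiles of the distribution over the $b$ bootstrap samples as the lower and upper confidence bounds.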
```python
import matplotlib.pyplot as plt
# Jupyter/IPython magic; remove this line when running as a plain script.
%matplotlib inline

# One accuracy value per bootstrap round (evaluated on the out-of-bag set).
accuracies = cross_val_score(lr, X, y, cv=BootstrapOutOfBag(n_splits=1000, random_seed=456))
mean = np.mean(accuracies)

# Percentile method: 2.5th and 97.5th percentiles bound the 95% interval.
lower = np.percentile(accuracies, 2.5)
upper = np.percentile(accuracies, 97.5)

fig, ax = plt.subplots(figsize=(8, 4))
ax.vlines(mean, [0], 40, lw=2.5, linestyle='-', label='mean')
ax.vlines(lower, [0], 15, lw=2.5, linestyle='-.', label='CI95 percentile')
ax.vlines(upper, [0], 15, lw=2.5, linestyle='-.')

ax.hist(accuracies, bins=11,
        color='#0080ff', edgecolor="none",
        alpha=0.3)
plt.legend(loc='upper left')
plt.show()
```
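For other confidence levels, the percentile method can be wrapped in a small helper that maps $\alpha$ to the two percentiles. The sketch below is an illustration under stated assumptions: `percentile_ci` is a hypothetical name, and the scores are synthetic numbers standing in for real out-of-bag accuracies, not results from the iris example.

```python
import numpy as np


def percentile_ci(scores, alpha=0.025):
    """Return (lower, upper) bounds of a 100*(1 - 2*alpha)% interval.

    Hypothetical helper implementing the percentile method.
    """
    scores = np.asarray(scores, dtype=float)
    lower = np.percentile(scores, 100 * alpha)        # e.g. 2.5th percentile
    upper = np.percentile(scores, 100 * (1 - alpha))  # e.g. 97.5th percentile
    return lower, upper


# Synthetic stand-in for bootstrap accuracies (made-up values).
rng = np.random.default_rng(0)
fake_accuracies = np.clip(rng.normal(0.95, 0.02, size=1000), 0.0, 1.0)

lower, upper = percentile_ci(fake_accuracies, alpha=0.025)
print('95%% CI: [%.3f, %.3f]' % (lower, upper))
```

Passing `alpha=0.05` would give a 90% interval instead, following the $100 \times (1-2 \times \alpha)$ relationship.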
API
BootstrapOutOfBag(n_splits=200, random_seed=None)
Parameters
- `n_splits` : int (default=200)
    Number of bootstrap iterations. Must be larger than 1.
- `random_seed` : int (default=None)
    If int, random_seed is the seed used by the random number generator.
Returns
- `train_idx` : ndarray
    The training set indices for that split.
- `test_idx` : ndarray
    The testing set indices for that split.
Examples
For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/evaluate/BootstrapOutOfBag/
Methods
get_n_splits(X=None, y=None, groups=None)
Returns the number of splitting iterations in the cross-validator
Parameters
- `X` : object
    Always ignored, exists for compatibility with scikit-learn.
- `y` : object
    Always ignored, exists for compatibility with scikit-learn.
- `groups` : object
    Always ignored, exists for compatibility with scikit-learn.
Returns
- `n_splits` : int
    Returns the number of splitting iterations in the cross-validator.
split(X, y=None, groups=None)
- `y` : array-like or None (default: None)
    Argument is not used and only included as a parameter for compatibility, similar to `KFold` in scikit-learn.
- `groups` : array-like or None (default: None)
    Argument is not used and only included as a parameter for compatibility, similar to `KFold` in scikit-learn.
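
To make the interface above concrete, here is a minimal sketch of a splitter exposing the same two methods, `split` and `get_n_splits`. The class `SimpleBootstrapOOB` is a hypothetical stand-in written for this illustration; it is not mlxtend's source code and omits input validation.

```python
import numpy as np


class SimpleBootstrapOOB:
    """Hypothetical stand-in showing the scikit-learn-style splitter interface."""

    def __init__(self, n_splits=200, random_seed=None):
        self.n_splits = n_splits
        self.random_seed = random_seed

    def get_n_splits(self, X=None, y=None, groups=None):
        # X, y and groups are ignored; they exist only for compatibility.
        return self.n_splits

    def split(self, X, y=None, groups=None):
        # y and groups are accepted but unused, as in the API described above.
        rng = np.random.RandomState(self.random_seed)
        sample_idx = np.arange(len(X))
        for _ in range(self.n_splits):
            # Training indices: drawn with replacement from all indices.
            train_idx = rng.choice(sample_idx, size=sample_idx.shape[0],
                                   replace=True)
            # Test indices: everything not drawn (the out-of-bag samples).
            test_idx = np.setdiff1d(sample_idx, train_idx)
            yield train_idx, test_idx


splitter = SimpleBootstrapOOB(n_splits=3, random_seed=1)
print(splitter.get_n_splits())
for train_idx, test_idx in splitter.split(np.array([1, 2, 3, 4, 5])):
    print(train_idx, test_idx)
```

Because `split` is a generator yielding index pairs and `get_n_splits` reports the number of rounds, an object of this shape follows the same pattern as `KFold`.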