Mlxtend.regressor

mlxtend version: 0.23.1

LinearRegression

LinearRegression(method='direct', eta=0.01, epochs=50, minibatches=None, random_seed=None, print_progress=0)

Ordinary least squares linear regression.

Parameters

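method : string (default: 'direct')

For gradient descent-based optimization, use sgd (see the minibatches parameter for further options). Otherwise, if direct (default), the analytical method is used. For alternative, numerically more stable solutions, use either qr (QR decomposition) or svd (singular value decomposition).
eta : float (default: 0.01)

Solver learning rate (between 0.0 and 1.0). Used with method = 'sgd'. (See the method parameter for details.)
epochs : int (default: 50)

Number of passes over the training dataset. Prior to each epoch, the dataset is shuffled if minibatches > 1 to prevent cycles in stochastic gradient descent. Used with method = 'sgd'. (See the method parameter for details.)
minibatches : int (default: None)

The number of minibatches for gradient-based optimization.
- If None: direct method, QR, or SVD method (see the method parameter for details)
- If 1: gradient descent learning
- If len(y): stochastic gradient descent learning
- If 1 < minibatches < len(y): minibatch learning
random_seed : int (default: None)

Set the random state for shuffling and initializing the weights. Used with method = 'sgd'. (See the method parameter for details.)
print_progress : int (default: 0)

Prints progress in fitting to stderr if method = 'sgd'.
- 0: No output
- 1: Epochs elapsed and cost
- 2: 1 plus time elapsed
- 3: 2 plus estimated time until completion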
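
The three closed-form options named under method can be illustrated with plain NumPy. The following is a minimal sketch of the underlying linear algebra on invented toy data, not mlxtend's internal code; the variable names are illustrative assumptions.

```python
import numpy as np

# Toy data: 50 samples, 3 features, noiseless linear target (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.5, -2.0, 0.5])
true_b = 4.0
y = X @ true_w + true_b

# Prepend a column of ones so the bias is estimated together with the weights.
Xb = np.hstack([np.ones((X.shape[0], 1)), X])

# 'direct': solve the normal equations (Xb^T Xb) p = Xb^T y.
p_direct = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# 'qr': factor Xb = Q R, then solve the triangular system R p = Q^T y.
Q, R = np.linalg.qr(Xb)
p_qr = np.linalg.solve(R, Q.T @ y)

# 'svd': apply the pseudoinverse of Xb, which NumPy computes via SVD.
p_svd = np.linalg.pinv(Xb) @ y

for name, p in [("direct", p_direct), ("qr", p_qr), ("svd", p_svd)]:
    print(name, "bias:", p[0], "weights:", p[1:])
```

On well-conditioned data all three agree; QR and SVD avoid forming Xb^T Xb explicitly, which is why they tend to be more stable when features are nearly collinear.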

Attributes

w_ : 2d-array, shape={n_features, 1}

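Model weights after fitting.
b_ : 1d-array, shape={1,}

Bias unit after fitting.
cost_ : list

Sum of squared errors after each epoch; ignored if the solver is 'normal equation' (the closed-form solution).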
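
How eta, epochs, minibatches, and the per-epoch cost list relate can be sketched in plain NumPy. The helper fit_linear_sgd below is hypothetical and only illustrates the idea; mlxtend's own update rule, weight initialization, and gradient scaling may differ.

```python
import numpy as np

def fit_linear_sgd(X, y, eta=0.01, epochs=50, minibatches=1, random_seed=None):
    """Hypothetical helper: gradient-descent least squares with a cost log."""
    rng = np.random.RandomState(random_seed)
    w = np.zeros(X.shape[1])  # assumption: zero initialization for simplicity
    b = 0.0
    cost = []
    for _ in range(epochs):
        idx = np.arange(y.shape[0])
        if minibatches > 1:
            # Shuffle before each epoch to avoid cycling through the same order.
            idx = rng.permutation(idx)
        # minibatches=1 gives one full-batch update per epoch;
        # minibatches=len(y) gives one update per sample.
        for batch in np.array_split(idx, minibatches):
            errors = y[batch] - (X[batch] @ w + b)
            w += eta * (X[batch].T @ errors)
            b += eta * errors.sum()
        residuals = y - (X @ w + b)
        cost.append(float((residuals ** 2).sum()))  # sum of squared errors
    return w, b, cost

# Toy data following y = 2x + 1 (illustrative only).
X = np.array([[-1.5], [-0.5], [0.5], [1.5]])
y = 2.0 * X[:, 0] + 1.0

w_gd, b_gd, cost_gd = fit_linear_sgd(X, y, eta=0.05, epochs=100, minibatches=1)
w_sgd, b_sgd, cost_sgd = fit_linear_sgd(
    X, y, eta=0.05, epochs=100, minibatches=len(y), random_seed=0
)
print("gradient descent:", w_gd, b_gd, cost_gd[-1])
print("stochastic gradient descent:", w_sgd, b_sgd, cost_sgd[-1])
```

Plotting the cost list against the epoch index is a common way to check whether the learning rate is too large (cost grows) or too small (cost barely moves).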

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/regressor/LinearRegression/

Methods

fit(X, y, init_params=True)

Learn model from training data.

Parameters

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

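Training vectors, where n_samples is the number of samples and n_features is the number of features.
y : array-like, shape = [n_samples]

Target values.
init_params : bool (default: True)

Re-initializes model parameters prior to fitting. Set to False to continue training with the weights from a previous model fitting.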

Returns

self : object

get_params(deep=True)

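Get parameters for this estimator.

Parameters

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : mapping of string to any

Parameter names mapped to their values.

Adapted from https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/base.py Author: Gael Varoquaux gael.varoquaux@normalesup.org License: BSD 3 clause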

predict(X)

Predict targets from X.

Parameters

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns

target_values : array-like, shape = [n_samples]

Predicted target values.

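set_params(**params)

Set the parameters of this estimator. The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.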

Returns

self

Adapted from
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/base.py
Author: Gael Varoquaux <gael.varoquaux@normalesup.org>
License: BSD 3 clause
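
The <component>__<parameter> naming convention described for set_params can be illustrated with a scikit-learn Pipeline. This sketch uses scikit-learn's own estimators rather than mlxtend's LinearRegression, and the step names 'scale' and 'ridge' are arbitrary choices made for the example.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([('scale', StandardScaler()), ('ridge', Ridge(alpha=1.0))])

# <component>__<parameter>: update alpha on the step named 'ridge'.
pipe.set_params(ridge__alpha=0.1)

# get_params() exposes nested parameters under the same naming scheme.
alpha_after = pipe.get_params()['ridge__alpha']
print(alpha_after)
```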

StackingCVRegressor

StackingCVRegressor(regressors, meta_regressor, cv=5, shuffle=True, random_state=None, verbose=0, refit=True, use_features_in_secondary=False, store_train_meta_features=False, n_jobs=None, pre_dispatch='2*n_jobs', multi_output=False)

A 'Stacking Cross-Validation' regressor for scikit-learn estimators.

Parameters

regressors : array-like, shape = [n_regressors]

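A list of regressors. Invoking the fit method on the StackingCVRegressor will fit clones of these original regressors, which will be stored in the class attribute self.regr_.
meta_regressor : object

The meta-regressor to be fitted on the ensemble of regressors.
cv : int, cross-validation generator or iterable, optional (default: 5)

Determines the cross-validation splitting strategy. Possible inputs for cv are:
- None, to use the default 5-fold cross-validation,
- an integer, to specify the number of folds in a KFold,
- an object to be used as a cross-validation generator,
- an iterable yielding train, test splits.
For integer/None inputs, KFold cross-validation is used.
shuffle : bool (default: True)

If True, and the cv argument is an integer, the training data will be shuffled prior to cross-validation. If the cv argument is a specific cross-validation technique, this argument is ignored.
random_state : int, RandomState instance or None, optional (default: None)

Controls the randomness of the cv splitter. Used when cv is an integer and shuffle=True. New in v0.16.0.
verbose : int, optional (default=0)

Controls the verbosity of the building process. New in v0.16.0.
refit : bool (default: True)

Clones the regressors for stacking regression if True (default), or else uses the original ones, which will be refitted on the dataset upon calling the fit method. Setting refit=False is recommended if you are working with estimators that support the scikit-learn fit/predict API interface but are not compatible with scikit-learn's clone function.
use_features_in_secondary : bool (default: False)

If True, the meta-regressor will be trained both on the predictions of the original regressors and the original dataset. If False, the meta-regressor will be trained only on the predictions of the original regressors.
store_train_meta_features : bool (default: False)

If True, the meta-features computed from the training data and used for fitting the meta-regressor are stored in the self.train_meta_features_ array, which can be accessed after calling fit.
n_jobs : int or None, optional (default=None)

The number of CPUs to use for the computation. None means 1 unless in a :obj:`joblib.parallel_backend` context. -1 means using all processors. See :term:`Glossary <n_jobs>` for more details. New in v0.16.0.
pre_dispatch : int, or string, optional

Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:
- None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs
- An int, giving the exact number of total jobs that are spawned
- A string, giving an expression as a function of n_jobs, as in '2*n_jobs'
multi_output : bool (default: False)

If True, allow multi-output targets, but forbid nan or inf values. If False, y will be checked to be a vector. (New in v0.19.0.)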
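
The idea behind cross-validated stacking can be sketched with NumPy alone. The code below is a conceptual illustration under stated assumptions, not mlxtend's implementation: fit_ols is a hypothetical helper, the two "regressors" are least-squares fits on single features, and the refit on the full training set before prediction is assumed.

```python
import numpy as np

def fit_ols(X, y):
    """Hypothetical helper: least-squares fit with intercept; returns a predict function."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    params, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda X_new: np.hstack([np.ones((X_new.shape[0], 1)), X_new]) @ params

rng = np.random.RandomState(0)
X = rng.normal(size=(60, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=60)

# Two deliberately weak first-level "regressors": each sees only one feature.
feature_subsets = [[0], [1]]

cv = 5
indices = rng.permutation(len(y))  # corresponds to shuffle=True
folds = np.array_split(indices, cv)

# Out-of-fold predictions: each row is predicted by models that never saw it.
meta_features = np.zeros((len(y), len(feature_subsets)))
for k, test_idx in enumerate(folds):
    train_idx = np.hstack([f for j, f in enumerate(folds) if j != k])
    for col, cols in enumerate(feature_subsets):
        predict = fit_ols(X[train_idx][:, cols], y[train_idx])
        meta_features[test_idx, col] = predict(X[test_idx][:, cols])

# The meta-regressor is trained on the out-of-fold predictions only.
meta_predict = fit_ols(meta_features, y)

# For new data, first-level models refit on all training data supply the meta-features.
base_models = [fit_ols(X[:, cols], y) for cols in feature_subsets]
X_new = rng.normal(size=(5, 2))
new_meta = np.column_stack(
    [m(X_new[:, cols]) for m, cols in zip(base_models, feature_subsets)]
)
y_new = meta_predict(new_meta)
print(y_new)
```

Training the meta-regressor on out-of-fold predictions keeps it from rewarding first-level models that merely memorized their training rows.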

Attributes

train_meta_features_ : numpy array, shape = [n_samples, n_regressors]

Meta-features for the training data, where n_samples is the number of samples in the training data and len(self.regressors) is the number of regressors.

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/regressor/StackingCVRegressor/

Methods

fit(X, y, groups=None, sample_weight=None)

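Fit the ensemble regressors and the meta-regressor.

Parameters

X : numpy array, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.
y : numpy array, shape = [n_samples] or [n_samples, n_targets]

Target values. Multiple targets are supported only if self.multi_output is True.
groups : numpy array/None, shape = [n_samples]

The group that each sample belongs to. This is used by specific folding strategies such as GroupKFold().
sample_weight : array-like, shape = [n_samples], optional

Sample weights passed as sample_weights to each regressor in the regressors list as well as to the meta_regressor. Raises an error if some regressor does not support sample_weight in its fit() method.

Returns

self : object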
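
What a group-aware folding strategy does with the groups array can be seen with scikit-learn's GroupKFold on its own. This is a sketch on invented data; presumably such a splitter would be passed as cv and the same array given as groups to fit, but that wiring is an assumption not shown here.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(16, dtype=float).reshape(8, 2)
y = np.arange(8, dtype=float)
# Hypothetical grouping: four subjects with two samples each.
groups = np.array([0, 0, 1, 1, 2, 2, 3, 3])

splits = list(GroupKFold(n_splits=2).split(X, y, groups=groups))
for train_idx, test_idx in splits:
    # No group appears on both sides of a split.
    print('train groups:', sorted(set(groups[train_idx].tolist())),
          'test groups:', sorted(set(groups[test_idx].tolist())))
```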

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to `X` and `y` with optional parameters `fit_params`
and returns a transformed version of `X`.

Parameters

X : array-like of shape (n_samples, n_features)

Input samples.
y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).
**fit_params : dict

Additional fit parameters.

Returns

X_new : ndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()

Get metadata routing of this object.

Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.

Returns

routing : MetadataRequest

A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : dict

Parameter names mapped to their values.

predict(X)

Predict target values for X.

Parameters

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns

y_target : array-like, shape = [n_samples] or [n_samples, n_targets]

Predicted target values.

predict_meta_features(X)

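Get meta-features of test data.

Parameters

X : numpy array, shape = [n_samples, n_features]

Test vectors, where n_samples is the number of samples and n_features is the number of features.

Returns

meta-features : numpy array, shape = [n_samples, len(self.regressors)]

Meta-features for the test data, where n_samples is the number of samples in the test data and len(self.regressors) is the number of regressors. If self.multi_output is True, the number of columns is len(self.regressors) * n_targets.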
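
The two shapes described above can be checked with a small NumPy sketch. The prediction arrays are random stand-ins for the output of three first-level regressors, and the column ordering used for the multi-output case is an assumption.

```python
import numpy as np

n_samples, n_targets = 6, 2

# Stand-ins for the predictions of three first-level regressors (random numbers).
rng = np.random.RandomState(0)
predictions = [rng.normal(size=(n_samples, n_targets)) for _ in range(3)]

# Single-output case: one column per regressor.
single = np.column_stack([p[:, 0] for p in predictions])

# Multi-output case: each regressor contributes n_targets columns.
multi = np.hstack(predictions)

print(single.shape, multi.shape)
```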

score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.

The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.
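
The definition above translates directly into NumPy. The function name r2_score_sketch is a hypothetical label for this illustration, and the sketch ignores sample weights and multi-output targets.

```python
import numpy as np

def r2_score_sketch(y_true, y_pred):
    """Hypothetical helper: R^2 = 1 - u / v for a single target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1.0 - u / v

y_true = np.array([3.0, 5.0, 7.0, 9.0])

r2_perfect = r2_score_sketch(y_true, y_true)                                # 1.0
r2_constant = r2_score_sketch(y_true, np.full_like(y_true, y_true.mean()))  # 0.0
r2_reversed = r2_score_sketch(y_true, y_true[::-1])                         # -3.0
print(r2_perfect, r2_constant, r2_reversed)
```

The reversed prediction shows how a model that is worse than the constant mean predictor yields a negative score.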

Parameters

X : array-like of shape (n_samples, n_features)

Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True values for X.
sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns

score : float

:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.

Notes

The :math:`R^2` score used when calling ``score`` on a regressor uses ``multioutput='uniform_average'`` from version 0.23 to keep consistent with default value of :func:`~sklearn.metrics.r2_score`. This influences the ``score`` method of all the multioutput regressors (except for :class:`~sklearn.multioutput.MultiOutputRegressor`).

set_fit_request(self: mlxtend.regressor.stacking_cv_regression.StackingCVRegressor, *, groups: Union[bool, NoneType, str] = '$UNCHANGED$', sample_weight: Union[bool, NoneType, str] = '$UNCHANGED$') -> mlxtend.regressor.stacking_cv_regression.StackingCVRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if
``enable_metadata_routing=True`` (see :func:`sklearn.set_config`).
Please see :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.

The options for each parameter are:

- ``True``: metadata is requested, and passed to ``fit`` if provided. The request is ignored if metadata is not provided.

- ``False``: metadata is not requested and the meta-estimator will not pass it to ``fit``.

- ``None``: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

- ``str``: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (``sklearn.utils.metadata_routing.UNCHANGED``) retains the
existing request. This allows you to change the request for some
parameters and not others.

.. versionadded:: 1.3

.. note::
This method is only relevant if this estimator is used as a
sub-estimator of a meta-estimator, e.g. used inside a
:class:`~sklearn.pipeline.Pipeline`. Otherwise it has no effect.

Parameters

groups : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for groups parameter in fit.
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in fit.

Returns

self : object

The updated object.

set_output(*, transform=None)

Set output container.

See :ref:`sphx_glr_auto_examples_miscellaneous_plot_set_output.py`
for an example on how to use the API.

Parameters

transform : {"default", "pandas", "polars"}, default=None

Configure output of transform and fit_transform.
- "default": Default output format of a transformer
- "pandas": DataFrame output
- "polars": Polars output
- None: Transform configuration is unchanged
.. versionadded:: 1.4 "polars" option was added.

Returns

self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

Valid parameter keys can be listed with ``get_params()``.

Returns

self

set_score_request(self: mlxtend.regressor.stacking_cv_regression.StackingCVRegressor, *, sample_weight: Union[bool, NoneType, str] = '$UNCHANGED$') -> mlxtend.regressor.stacking_cv_regression.StackingCVRegressor

Request metadata passed to the score method.

Note that this method is only relevant if
``enable_metadata_routing=True`` (see :func:`sklearn.set_config`).
Please see :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.

The options for each parameter are:

- ``True``: metadata is requested, and passed to ``score`` if provided. The request is ignored if metadata is not provided.

- ``False``: metadata is not requested and the meta-estimator will not pass it to ``score``.

- ``None``: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

- ``str``: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (``sklearn.utils.metadata_routing.UNCHANGED``) retains the
existing request. This allows you to change the request for some
parameters and not others.

.. versionadded:: 1.3

.. note::
This method is only relevant if this estimator is used as a
sub-estimator of a meta-estimator, e.g. used inside a
:class:`~sklearn.pipeline.Pipeline`. Otherwise it has no effect.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns

self : object

The updated object.

Properties

named_regressors

Returns

List of named estimator tuples, like [('svc', SVC(...))]

StackingRegressor

StackingRegressor(regressors, meta_regressor, verbose=0, use_features_in_secondary=False, store_train_meta_features=False, refit=True, multi_output=False)

A stacking regressor for scikit-learn estimators for regression.

Parameters

regressors : array-like, shape = [n_regressors]

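A list of regressors. Invoking the fit method on the StackingRegressor will fit clones of these original regressors, which will be stored in the class attribute self.regr_.
meta_regressor : object

The meta-regressor to be fitted on the ensemble of regressors.
verbose : int, optional (default=0)

Controls the verbosity of the building process.
- verbose=0 (default): Prints nothing
- verbose=1: Prints the number and name of the regressor being fitted
- verbose=2: Prints info about the parameters of the regressor being fitted
- verbose>2: Changes the verbose parameter of the underlying regressor to self.verbose - 2
use_features_in_secondary : bool (default: False)

If True, the meta-regressor will be trained both on the predictions of the original regressors and the original dataset. If False, the meta-regressor will be trained only on the predictions of the original regressors.
store_train_meta_features : bool (default: False)

If True, the meta-features computed from the training data and used for fitting the meta-regressor are stored in the self.train_meta_features_ array, which can be accessed after calling fit.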
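
The effect of use_features_in_secondary on the meta-regressor's input can be sketched as a simple array concatenation. The arrays below are random stand-ins, and the column order (original features first, predictions second) is an assumption made for illustration.

```python
import numpy as np

rng = np.random.RandomState(1)
X = rng.normal(size=(10, 4))                  # original features
first_level_preds = rng.normal(size=(10, 3))  # stand-in for 3 regressors' predictions

use_features_in_secondary = True

if use_features_in_secondary:
    # Meta-regressor sees the original features and the predictions side by side.
    meta_input = np.hstack([X, first_level_preds])
else:
    # Meta-regressor sees the predictions only.
    meta_input = first_level_preds

print(meta_input.shape)
```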

Attributes

regr_ : list, shape=[n_regressors]

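Fitted regressors (clones of the original regressors)
meta_regr_ : estimator

Fitted meta-regressor (clone of the original meta-estimator)
coef_ : array-like, shape = [n_features]

Model coefficients of the fitted meta-estimator
intercept_ : float

Intercept of the fitted meta-estimator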
train_meta_features_ : numpy array, shape = [n_samples, len(self.regressors)]

Meta-features for the training data, where n_samples is the number of samples in the training data and len(self.regressors) is the number of regressors.
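refit : bool (default: True)

Clones the regressors for stacking regression if True (default), or else uses the original ones, which will be refitted on the dataset upon calling the fit method. Setting refit=False is recommended if you are working with estimators that support the scikit-learn fit/predict API interface but are not compatible with scikit-learn's clone function.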

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/regressor/StackingRegressor/

Methods

fit(X, y, sample_weight=None)

Learn weight coefficients from training data for each regressor.

Parameters

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

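Training vectors, where n_samples is the number of samples and n_features is the number of features.
y : numpy array, shape = [n_samples] or [n_samples, n_targets]

Target values. Multiple targets are supported only if self.multi_output is True.
sample_weight : array-like, shape = [n_samples], optional

Sample weights passed as sample_weights to each regressor in the regressors list as well as to the meta_regressor. Raises an error if some regressor does not support sample_weight in its fit() method.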

Returns

self : object

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to `X` and `y` with optional parameters `fit_params`
and returns a transformed version of `X`.

Parameters

X : array-like of shape (n_samples, n_features)

Input samples.
y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).
**fit_params : dict

Additional fit parameters.

Returns

X_new : ndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()

Get metadata routing of this object.

Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.

Returns

routing : MetadataRequest

A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating routing information.

get_params(deep=True)

Return estimator parameter names for GridSearch support.

predict(X)

Predict target values for X.

Parameters

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns

y_target : array-like, shape = [n_samples] or [n_samples, n_targets]

Predicted target values.

predict_meta_features(X)

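Get meta-features of test data.

Parameters

X : numpy array, shape = [n_samples, n_features]

Test vectors, where n_samples is the number of samples and n_features is the number of features.

Returns

meta-features : numpy array, shape = [n_samples, len(self.regressors)]

Meta-features for the test data, where n_samples is the number of samples in the test data and len(self.regressors) is the number of regressors. If self.multi_output is True, the number of columns is len(self.regressors) * n_targets.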

score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.

The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.

Parameters

X : array-like of shape (n_samples, n_features)

Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True values for X.
sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns

score : float

:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.

Notes

set_fit_request(self: mlxtend.regressor.stacking_regression.StackingRegressor, *, sample_weight: Union[bool, NoneType, str] = '$UNCHANGED$') -> mlxtend.regressor.stacking_regression.StackingRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if
``enable_metadata_routing=True`` (see :func:`sklearn.set_config`).
Please see :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.

The options for each parameter are:

- ``True``: metadata is requested, and passed to ``fit`` if provided. The request is ignored if metadata is not provided.

- ``False``: metadata is not requested and the meta-estimator will not pass it to ``fit``.

- ``None``: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

- ``str``: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (``sklearn.utils.metadata_routing.UNCHANGED``) retains the
existing request. This allows you to change the request for some
parameters and not others.

.. versionadded:: 1.3

.. note::
This method is only relevant if this estimator is used as a
sub-estimator of a meta-estimator, e.g. used inside a
:class:`~sklearn.pipeline.Pipeline`. Otherwise it has no effect.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in fit.

Returns

self : object

The updated object.

set_output(*, transform=None)

Set output container.

See :ref:`sphx_glr_auto_examples_miscellaneous_plot_set_output.py`
for an example on how to use the API.

Parameters

transform : {"default", "pandas", "polars"}, default=None

Configure output of transform and fit_transform.
- "default": Default output format of a transformer
- "pandas": DataFrame output
- "polars": Polars output
- None: Transform configuration is unchanged
.. versionadded:: 1.4 "polars" option was added.

Returns

self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

Valid parameter keys can be listed with ``get_params()``.

Returns

self

set_score_request(self: mlxtend.regressor.stacking_regression.StackingRegressor, *, sample_weight: Union[bool, NoneType, str] = '$UNCHANGED$') -> mlxtend.regressor.stacking_regression.StackingRegressor

Request metadata passed to the score method.

Note that this method is only relevant if
``enable_metadata_routing=True`` (see :func:`sklearn.set_config`).
Please see :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.

The options for each parameter are:

- ``True``: metadata is requested, and passed to ``score`` if provided. The request is ignored if metadata is not provided.

- ``False``: metadata is not requested and the meta-estimator will not pass it to ``score``.

- ``None``: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

- ``str``: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (``sklearn.utils.metadata_routing.UNCHANGED``) retains the
existing request. This allows you to change the request for some
parameters and not others.

.. versionadded:: 1.3

.. note::
This method is only relevant if this estimator is used as a
sub-estimator of a meta-estimator, e.g. used inside a
:class:`~sklearn.pipeline.Pipeline`. Otherwise it has no effect.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns

self : object

The updated object.

Properties

coef_

None

intercept_

None

named_regressors

None