PLSRegression

class sklearn.cross_decomposition.PLSRegression(n_components=2, *, scale=True, max_iter=500, tol=1e-06, copy=True)

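PLS regression.

PLSRegression is also known as PLS2 or PLS1, depending on the number of targets.

For a comparison with other cross decomposition algorithms, see Compare cross decomposition methods.

Read more in the User Guide.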

Added in version 0.8.

Parameters:

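n_components : int, default=2
    Number of components to keep. Should be in [1, n_features].
scale : bool, default=True
    Whether to scale X and Y.
max_iter : int, default=500
    The maximum number of iterations of the power method when algorithm='nipals'. Ignored otherwise.
tol : float, default=1e-06
    The tolerance used as convergence criterion in the power method: the algorithm stops whenever the squared norm of u_i - u_{i-1} is less than tol, where u corresponds to the left singular vector.
copy : bool, default=True
    Whether to copy X and Y in fit before applying centering, and potentially scaling. If False, these operations are done in place, modifying both arrays.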

Attributes:

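x_weights_ : ndarray of shape (n_features, n_components)
    The left singular vectors of the cross-covariance matrices of each iteration.
y_weights_ : ndarray of shape (n_targets, n_components)
    The right singular vectors of the cross-covariance matrices of each iteration.
x_loadings_ : ndarray of shape (n_features, n_components)
    The loadings of X.
y_loadings_ : ndarray of shape (n_targets, n_components)
    The loadings of Y.
x_scores_ : ndarray of shape (n_samples, n_components)
    The transformed training samples.
y_scores_ : ndarray of shape (n_samples, n_components)
    The transformed training targets.
x_rotations_ : ndarray of shape (n_features, n_components)
    The projection matrix used to transform X.
y_rotations_ : ndarray of shape (n_targets, n_components)
    The projection matrix used to transform Y.
coef_ : ndarray of shape (n_targets, n_features)
    The coefficients of the linear model such that Y is approximated as Y = X @ coef_.T + intercept_.
intercept_ : ndarray of shape (n_targets,)
    The intercepts of the linear model such that Y is approximated as Y = X @ coef_.T + intercept_.

    Added in version 1.1.
n_iter_ : list of shape (n_components,)
    Number of iterations of the power method, for each component.
n_features_in_ : int
    Number of features seen during fit.
feature_names_in_ : ndarray of shape (n_features_in_,)
    Names of features seen during fit. Defined only when X has feature names that are all strings.

    Added in version 1.0.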

See also

PLSCanonical: Partial Least Squares transformer and regressor.

Examples

>>> from sklearn.cross_decomposition import PLSRegression
>>> X = [[0., 0., 1.], [1.,0.,0.], [2.,2.,2.], [2.,5.,4.]]
>>> y = [[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]]
>>> pls2 = PLSRegression(n_components=2)
>>> pls2.fit(X, y)
PLSRegression()
>>> Y_pred = pls2.predict(X)

For a comparison between PLS regression and PCA, see Principal Component Regression vs Partial Least Squares Regression.

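fit(X, y=None, Y=None)

Fit model to data.

Parameters:

X : array-like of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of predictors.
y : array-like of shape (n_samples,) or (n_samples, n_targets)
    Target vectors, where n_samples is the number of samples and n_targets is the number of response variables.
Y : array-like of shape (n_samples,) or (n_samples, n_targets)
    Target vectors, where n_samples is the number of samples and n_targets is the number of response variables.

Returns:

self : object
    Fitted model.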

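fit_transform(X, y=None)

Learn and apply the dimension reduction on the training data.

Parameters:

X : array-like of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of predictors.
y : array-like of shape (n_samples, n_targets), default=None
    Target vectors, where n_samples is the number of samples and n_targets is the number of response variables.

Returns:

self : ndarray of shape (n_samples, n_components)
    Returns x_scores if y is not given, (x_scores, y_scores) otherwise.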

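get_feature_names_out(input_features=None)

Get output feature names for transformation.

The output feature names are prefixed by the lowercased class name. For example, if the transformer outputs 3 features, then the output feature names are: ["class_name0", "class_name1", "class_name2"].

Parameters:

input_features : array-like of str or None, default=None
    Only used to validate feature names against the names seen in fit.

Returns:

feature_names_out : ndarray of str objects
    Transformed feature names.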

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest
    A MetadataRequest encapsulating routing information.

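get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict
    Parameter names mapped to their values.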

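inverse_transform(X, y=None, Y=None)

Transform data back to its original space.

Parameters:

X : array-like of shape (n_samples, n_components)
    New data, where n_samples is the number of samples and n_components is the number of PLS components.
y : array-like of shape (n_samples,) or (n_samples, n_components)
    New target, where n_samples is the number of samples and n_components is the number of PLS components.
Y : array-like of shape (n_samples, n_components)
    New target, where n_samples is the number of samples and n_components is the number of PLS components.

    Deprecated since version 1.5: Y is deprecated in 1.5 and will be removed in 1.7. Use y instead.

Returns:

X_reconstructed : ndarray of shape (n_samples, n_features)
    The reconstructed X data.
y_reconstructed : ndarray of shape (n_samples, n_targets)
    The reconstructed targets. Only returned when y is given.

Notes

This transformation is exact only if n_components=n_features.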

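predict(X, copy=True)

Predict targets of given samples.

Parameters:

X : array-like of shape (n_samples, n_features)
    Samples.
copy : bool, default=True
    Whether to copy X, or perform in-place normalization.

Returns:

y_pred : ndarray of shape (n_samples,) or (n_samples, n_targets)
    The predicted values.

Notes

This call requires the estimation of a matrix of shape (n_features, n_targets), which may be an issue in high dimensional space.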

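score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination $R^2$ is defined as $(1 - \frac{u}{v})$, where $u$ is the residual sum of squares ((y_true - y_pred) ** 2).sum() and $v$ is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an $R^2$ score of 0.0.

Parameters:

X : array-like of shape (n_samples, n_features)
    Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y : array-like of shape (n_samples,) or (n_samples, n_outputs)
    True values for X.
sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.

Returns:

score : float
    $R^2$ of self.predict(X) with respect to y.

Notes

The $R^2$ score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).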

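set_output(*, transform=None)

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:

transform : {"default", "pandas", "polars"}, default=None
    Configure output of transform and fit_transform.

    "default": Default output format of a transformer
    "pandas": DataFrame output
    "polars": Polars output
    None: Transform configuration is unchanged

    Added in version 1.4: "polars" option was added.

Returns:

self : estimator instance
    Estimator instance.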

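set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.

Parameters:

**params : dict
    Estimator parameters.

Returns:

self : estimator instance
    Estimator instance.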

set_predict_request(*, copy: bool | None | str = '$UNCHANGED$') → PLSRegression

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True : metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False : metadata is not requested and the meta-estimator will not pass it to predict .
None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline . Otherwise it has no effect.

Parameters:

copy : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
    Metadata routing for copy parameter in predict.

Returns:

self : object
    The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → PLSRegression

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True : metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False : metadata is not requested and the meta-estimator will not pass it to score .
None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline . Otherwise it has no effect.

Parameters:

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
    Metadata routing for sample_weight parameter in score.

Returns:

self : object
    The updated object.

set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') → PLSRegression

Request metadata passed to the transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True : metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False : metadata is not requested and the meta-estimator will not pass it to transform .
None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline . Otherwise it has no effect.

Parameters:

copy : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
    Metadata routing for copy parameter in transform.

Returns:

self : object
    The updated object.

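transform(X, y=None, Y=None, copy=True)

Apply the dimension reduction.

Parameters:

X : array-like of shape (n_samples, n_features)
    Samples to transform.
y : array-like of shape (n_samples, n_targets), default=None
    Target vectors.
Y : array-like of shape (n_samples, n_targets), default=None
    Target vectors.

    Deprecated since version 1.5: Y is deprecated in 1.5 and will be removed in 1.7. Use y instead.
copy : bool, default=True
    Whether to copy X and Y, or perform in-place normalization.

Returns:

x_scores, y_scores : array-like or tuple of array-like
    Returns x_scores if y is not given, (x_scores, y_scores) otherwise.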

Gallery examples

Principal Component Regression vs Partial Least Squares Regression

Compare cross decomposition methods