GammaRegressor

class sklearn.linear_model.GammaRegressor(*, alpha=1.0, fit_intercept=True, solver='lbfgs', max_iter=100, tol=0.0001, warm_start=False, verbose=0)

Generalized Linear Model with a Gamma distribution.

This regressor uses the 'log' link function.

Read more in the User Guide.

Added in version 0.23.
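For orientation, the model and its objective can be written out explicitly. This is a sketch that follows the GLM formulation in the scikit-learn user guide; the symbols $w$ (for coef_), $w_0$ (for intercept_) and $d$ (the unit Gamma deviance) are notation assumed here, not names from the API, and $\alpha$ is the alpha parameter described below.

$$
\hat{y}_i = \exp\left(x_i^\top w + w_0\right), \qquad
\min_{w,\, w_0} \; \frac{1}{2\, n_{\text{samples}}} \sum_{i} d(y_i, \hat{y}_i) + \frac{\alpha}{2} \lVert w \rVert_2^2, \qquad
d(y, \hat{y}) = 2\left(\log\frac{\hat{y}}{y} + \frac{y}{\hat{y}} - 1\right)
$$

The intercept $w_0$ is not part of the penalty, and with sample weights the plain average becomes a weighted average.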

Parameters:

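alpha : float, default=1

Constant that multiplies the L2 penalty term and determines the regularization strength. alpha = 0 is equivalent to an unpenalized GLM. In this case, the design matrix X must have full column rank (no collinearities). Values of alpha must be in the range [0.0, inf).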

fit_intercept : bool, default=True

Specifies whether a constant (a.k.a. bias or intercept) should be added to the linear predictor X @ coef_ + intercept_.

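solver : {'lbfgs', 'newton-cholesky'}, default='lbfgs'

Algorithm to use in the optimization problem:

'lbfgs': Calls scipy's L-BFGS-B optimizer.
'newton-cholesky': Uses Newton-Raphson steps (in arbitrary precision arithmetic equivalent to iterated reweighted least squares) with an inner Cholesky-based solver. This solver is a good choice for n_samples >> n_features, especially with one-hot encoded categorical features with rare categories. Be aware that the memory usage of this solver has a quadratic dependency on n_features because it explicitly computes the Hessian matrix.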

Added in version 1.2.

max_iter : int, default=100

The maximal number of iterations for the solver. Values must be in the range [1, inf).

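tol : float, default=1e-4

Stopping criterion. For the lbfgs solver, the iteration will stop when max{|g_j|, j = 1, ..., d} <= tol, where g_j is the j-th component of the gradient (derivative) of the objective function. Values must be in the range (0.0, inf).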

warm_start : bool, default=False

If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.

verbose : int, default=0

For the lbfgs solver, set verbose to any positive number for verbosity. Values must be in the range [0, inf).
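One way the alpha and warm_start parameters can work together is a small regularization path, where each fit starts from the previous solution. The sketch below is only an illustration; the toy data and the list of alpha values are assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import GammaRegressor

# Small toy data set (assumed for illustration); targets must be positive.
X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([19, 26, 33, 30], dtype=float)

reg = GammaRegressor(warm_start=True)
coef_norms = []
for alpha in [10.0, 1.0, 0.1]:
    reg.set_params(alpha=alpha)
    # With warm_start=True, this fit is initialized from the coef_ and
    # intercept_ found by the previous call to fit.
    reg.fit(X, y)
    coef_norms.append(float(np.linalg.norm(reg.coef_)))

# A weaker L2 penalty leaves the coefficients less shrunk towards zero.
print(coef_norms)
```

The attribute n_iter_ can be inspected after each fit to see how many solver iterations a warm-started fit actually needed.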

Attributes:

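coef_ : array of shape (n_features,): Estimated coefficients for the linear predictor (X @ coef_ + intercept_) in the GLM.
intercept_ : float: Intercept (a.k.a. bias) added to the linear predictor.
n_features_in_ : int: Number of features seen during fit.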

Added in version 0.24.
n_iter_ : int: Actual number of iterations used in the solver.
feature_names_in_ : ndarray of shape (n_features_in_,): Names of features seen during fit. Defined only when X has feature names that are all strings.

Added in version 1.0.

See also

PoissonRegressor: Generalized Linear Model with a Poisson distribution.
TweedieRegressor: Generalized Linear Model with a Tweedie distribution.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.GammaRegressor()
>>> X = [[1, 2], [2, 3], [3, 4], [4, 3]]
>>> y = [19, 26, 33, 30]
>>> clf.fit(X, y)
GammaRegressor()
>>> clf.score(X, y)
0.773...
>>> clf.coef_
array([0.072..., 0.066...])
>>> clf.intercept_
2.896...
>>> clf.predict([[1, 0], [2, 8]])
array([19.483..., 35.795...])

fit(X, y, sample_weight=None)

Fit a Generalized Linear Model.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features): Training data.
y : array-like of shape (n_samples,): Target values.
sample_weight : array-like of shape (n_samples,), default=None: Sample weights.

Returns:

self : object: Fitted model.
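To give a feel for what sample_weight does in fit, the sketch below compares a weighted fit with a fit on data where each row is repeated according to its integer weight. The toy data, the weights, and the tight tol are assumptions made for this example; the two fits are expected to agree only up to solver tolerance.

```python
import numpy as np
from sklearn.linear_model import GammaRegressor

# Small toy data set (assumed for illustration); targets must be positive.
X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([19, 26, 33, 30], dtype=float)
weights = np.array([1, 2, 1, 3])

# Fit once with explicit sample weights.
weighted = GammaRegressor(tol=1e-8, max_iter=1000).fit(X, y, sample_weight=weights)

# Fit again on a data set where each row is repeated as often as its weight.
X_rep = np.repeat(X, weights, axis=0)
y_rep = np.repeat(y, weights)
repeated = GammaRegressor(tol=1e-8, max_iter=1000).fit(X_rep, y_rep)

print(weighted.coef_, weighted.intercept_)
print(repeated.coef_, repeated.intercept_)
```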

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest: A MetadataRequest encapsulating routing information.

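get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict: Parameter names mapped to their values.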
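A short sketch of how the returned dictionary might be used: its keys are the constructor arguments, so it can be passed back to the constructor to build an unfitted estimator with the same configuration. The specific parameter values are arbitrary choices for this example.

```python
from sklearn.linear_model import GammaRegressor

reg = GammaRegressor(alpha=0.5, max_iter=200)
params = reg.get_params()

# The dict maps constructor argument names to their current values.
print(params["alpha"], params["max_iter"])

# Feeding it back to the constructor gives an unfitted estimator with the
# same configuration.
copy = GammaRegressor(**params)
```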

predict(X)

Predict using the GLM with feature matrix X.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features): Samples.

Returns:

y_pred : array of shape (n_samples,): Returns predicted values.
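Because this regressor uses the log link, the predicted mean is the exponential of the linear predictor X @ coef_ + intercept_. The sketch below checks this by hand on toy data assumed for the example.

```python
import numpy as np
from sklearn.linear_model import GammaRegressor

# Small toy data set (assumed for illustration); targets must be positive.
X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([19, 26, 33, 30], dtype=float)

reg = GammaRegressor().fit(X, y)

# Inverse of the log link applied to the linear predictor.
manual = np.exp(X @ reg.coef_ + reg.intercept_)
pred = reg.predict(X)

print(np.allclose(manual, pred))
```

A consequence of the log link is that predictions are always strictly positive.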

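score(X, y, sample_weight=None)

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2. R^2 uses squared error and D^2 uses the deviance of this GLM, see the User Guide.

D^2 is defined as $D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}$, where $D_{null}$ is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to $y_{pred} = \bar{y}$. The mean $\bar{y}$ is averaged by sample_weight. The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse).

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features): Test samples.
y : array-like of shape (n_samples,): True values of target.
sample_weight : array-like of shape (n_samples,), default=None: Sample weights.

Returns:

score : float: D^2 of self.predict(X) w.r.t. y.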
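The definition of D^2 can be reproduced by hand. The sketch below assumes the same toy data as elsewhere on this page and uses sklearn.metrics.mean_gamma_deviance for both the model deviance and the null deviance; it is meant as a check of the formula, not as a description of the internal implementation.

```python
import numpy as np
from sklearn.linear_model import GammaRegressor
from sklearn.metrics import mean_gamma_deviance

# Small toy data set (assumed for illustration); targets must be positive.
X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([19, 26, 33, 30], dtype=float)

reg = GammaRegressor().fit(X, y)
y_pred = reg.predict(X)

# Deviance of the fitted model and of the intercept-only model that always
# predicts the mean of y.
dev = mean_gamma_deviance(y, y_pred)
dev_null = mean_gamma_deviance(y, np.full_like(y, y.mean()))

d2_manual = 1 - dev / dev_null
d2 = reg.score(X, y)
print(d2_manual, d2)
```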

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → GammaRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for the sample_weight parameter in fit.

Returns:

self : object: The updated object.

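set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.

Parameters:

**params : dict: Estimator parameters.

Returns:

self : estimator instance: Estimator instance.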
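The two forms of parameter names can be sketched as follows. The pipeline and its step names "scale" and "glm" are hypothetical labels chosen for this example.

```python
from sklearn.linear_model import GammaRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are addressed by their plain names.
reg = GammaRegressor()
reg.set_params(alpha=0.1, max_iter=300)

# Nested object: parameters are addressed as <component>__<parameter>.
pipe = Pipeline([("scale", StandardScaler()), ("glm", GammaRegressor())])
pipe.set_params(glm__alpha=0.1)

print(reg.alpha, reg.max_iter)
print(pipe.named_steps["glm"].alpha)
```

The same <component>__<parameter> names are the keys returned by pipe.get_params(deep=True).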

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → GammaRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for the sample_weight parameter in score.

Returns:

self : object: The updated object.

Gallery examples

Release Highlights for scikit-learn 0.23

Tweedie regression on insurance claims