.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/ensemble/plot_gradient_boosting_regularization.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_ensemble_plot_gradient_boosting_regularization.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_ensemble_plot_gradient_boosting_regularization.py:


================================
Gradient Boosting regularization
================================

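Illustration of the effect of different regularization strategies for Gradient Boosting. The example is taken from Hastie et al. 2009 [1]_.

The loss function used is binomial deviance. Regularization via shrinkage (``learning_rate < 1.0``) can improve performance considerably. In combination with shrinkage, stochastic gradient boosting (``subsample < 1.0``) can produce more accurate models by reducing the variance via bagging. Subsampling without shrinkage usually does poorly. Another strategy to reduce the variance is to subsample the features, analogous to the random splits in Random Forests (via the ``max_features`` parameter).

.. [1] T. Hastie, R. Tibshirani and J. Friedman, "Elements of Statistical Learning Ed. 2", Springer, 2009.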
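
As a minimal sketch of the binomial deviance mentioned above, the snippet below computes the mean deviance by hand as minus twice the mean Bernoulli log-likelihood and compares it with ``2 * log_loss``. The labels and probabilities are made-up illustrative values and are not taken from the example data.

.. code-block:: Python

    import numpy as np
    from sklearn.metrics import log_loss

    # Hypothetical labels and predicted probabilities of the positive class
    # (illustrative values only).
    y_true = np.array([0, 1, 1, 0, 1])
    p_pos = np.array([0.1, 0.8, 0.6, 0.3, 0.9])

    # Mean binomial deviance: -2 times the mean Bernoulli log-likelihood.
    log_likelihood = y_true * np.log(p_pos) + (1 - y_true) * np.log(1 - p_pos)
    manual_deviance = -2.0 * np.mean(log_likelihood)

    # log_loss returns the mean negative log-likelihood, so doubling it
    # should give the same value.
    sklearn_deviance = 2.0 * log_loss(y_true, p_pos)

    print(manual_deviance, sklearn_deviance)

Lower values indicate predicted probabilities that agree better with the observed labels.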

.. GENERATED FROM PYTHON SOURCE LINES 13-81



.. image-sg:: /auto_examples/ensemble/images/sphx_glr_plot_gradient_boosting_regularization_001.png
   :alt: plot gradient boosting regularization
   :srcset: /auto_examples/ensemble/images/sphx_glr_plot_gradient_boosting_regularization_001.png
   :class: sphx-glr-single-img





.. code-block:: Python


    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause

    import matplotlib.pyplot as plt
    import numpy as np

    from sklearn import datasets, ensemble
    from sklearn.metrics import log_loss
    from sklearn.model_selection import train_test_split

    X, y = datasets.make_hastie_10_2(n_samples=4000, random_state=1)

    # map labels from {-1, 1} to {0, 1}
    labels, y = np.unique(y, return_inverse=True)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=0)

    original_params = {
        "n_estimators": 400,
        "max_leaf_nodes": 4,
        "max_depth": None,
        "random_state": 2,
        "min_samples_split": 5,
    }

    plt.figure()

    for label, color, setting in [
        ("No shrinkage", "orange", {"learning_rate": 1.0, "subsample": 1.0}),
        ("learning_rate=0.2", "turquoise", {"learning_rate": 0.2, "subsample": 1.0}),
        ("subsample=0.5", "blue", {"learning_rate": 1.0, "subsample": 0.5}),
        (
            "learning_rate=0.2, subsample=0.5",
            "gray",
            {"learning_rate": 0.2, "subsample": 0.5},
        ),
        (
            "learning_rate=0.2, max_features=2",
            "magenta",
            {"learning_rate": 0.2, "max_features": 2},
        ),
    ]:
        params = dict(original_params)
        params.update(setting)

        clf = ensemble.GradientBoostingClassifier(**params)
        clf.fit(X_train, y_train)

        # compute test set deviance
        test_deviance = np.zeros((params["n_estimators"],), dtype=np.float64)

        for i, y_proba in enumerate(clf.staged_predict_proba(X_test)):
            test_deviance[i] = 2 * log_loss(y_test, y_proba[:, 1])

        plt.plot(
            (np.arange(test_deviance.shape[0]) + 1)[::5],
            test_deviance[::5],
            "-",
            color=color,
            label=label,
        )

    plt.legend(loc="upper right")
    plt.xlabel("Boosting Iterations")
    plt.ylabel("Test Set Deviance")

    plt.show()
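
The per-iteration deviance curve can also be used to read off the boosting iteration with the lowest test set deviance. The following is a rough sketch for a single setting. The smaller ``n_samples`` and ``n_estimators`` values and the ``best_iteration`` name are assumptions made here to keep the run short, and they are not part of the example above.

.. code-block:: Python

    import numpy as np

    from sklearn import datasets, ensemble
    from sklearn.metrics import log_loss
    from sklearn.model_selection import train_test_split

    # Smaller problem than in the example above (assumed sizes, for speed).
    X, y = datasets.make_hastie_10_2(n_samples=1000, random_state=1)
    _, y = np.unique(y, return_inverse=True)  # map labels to {0, 1}
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.8, random_state=0
    )

    # One setting combining shrinkage and subsampling.
    clf = ensemble.GradientBoostingClassifier(
        n_estimators=100,
        learning_rate=0.2,
        subsample=0.5,
        max_leaf_nodes=4,
        max_depth=None,
        min_samples_split=5,
        random_state=2,
    )
    clf.fit(X_train, y_train)

    # Test set deviance after each boosting stage.
    test_deviance = np.array(
        [2 * log_loss(y_test, proba[:, 1]) for proba in clf.staged_predict_proba(X_test)]
    )

    # Stages are counted from 1, hence the offset.
    best_iteration = int(np.argmin(test_deviance)) + 1
    print(f"lowest test deviance {test_deviance.min():.3f} at iteration {best_iteration}")

Choosing the number of iterations on the test set in this way is only meant to illustrate how to read the curve. In practice a separate validation set would be the safer choice.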


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 9.342 seconds)


.. _sphx_glr_download_auto_examples_ensemble_plot_gradient_boosting_regularization.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/ensemble/plot_gradient_boosting_regularization.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_gradient_boosting_regularization.ipynb <plot_gradient_boosting_regularization.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_gradient_boosting_regularization.py <plot_gradient_boosting_regularization.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_gradient_boosting_regularization.zip <plot_gradient_boosting_regularization.zip>`


.. include:: plot_gradient_boosting_regularization.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_