.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/model_selection/plot_cv_predict.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_model_selection_plot_cv_predict.py:


====================================
Plotting Cross-Validated Predictions
====================================

This example shows how to use
:func:`~sklearn.model_selection.cross_val_predict` together with
:class:`~sklearn.metrics.PredictionErrorDisplay` to visualize prediction
errors.

.. GENERATED FROM PYTHON SOURCE LINES 12-13

We load the diabetes dataset and create an instance of a linear regression
model.

.. GENERATED FROM PYTHON SOURCE LINES 13-20

.. code-block:: Python

    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import LinearRegression

    # Feature matrix X and regression target y
    X, y = load_diabetes(return_X_y=True)
    lr = LinearRegression()

.. GENERATED FROM PYTHON SOURCE LINES 21-22

:func:`~sklearn.model_selection.cross_val_predict` returns an array of the
same size as `y`, where each entry is a prediction obtained by
cross-validation.

.. GENERATED FROM PYTHON SOURCE LINES 22-27

.. code-block:: Python

    from sklearn.model_selection import cross_val_predict

    # Each sample is predicted by the model trained without that sample's fold
    y_pred = cross_val_predict(lr, X, y, cv=10)

.. GENERATED FROM PYTHON SOURCE LINES 28-31

Since `cv=10`, 10 models were trained, and each model was used to predict
one of the 10 folds. We can now use
:class:`~sklearn.metrics.PredictionErrorDisplay` to visualize the prediction
errors.

On the left axis, we plot the observed values :math:`y` against the predicted
values :math:`\hat{y}` given by the models. On the right axis, we plot the
residuals (i.e. the difference between the observed and the predicted values)
against the predicted values.

.. GENERATED FROM PYTHON SOURCE LINES 31-58

.. code-block:: Python

    import matplotlib.pyplot as plt

    from sklearn.metrics import PredictionErrorDisplay

    fig, axs = plt.subplots(ncols=2, figsize=(8, 4))
    # Left panel: observed values against cross-validated predictions
    PredictionErrorDisplay.from_predictions(
        y,
        y_pred=y_pred,
        kind="actual_vs_predicted",
        subsample=100,
        ax=axs[0],
        random_state=0,
    )
    axs[0].set_title("Actual vs. Predicted values")
    # Right panel: residuals against cross-validated predictions
    PredictionErrorDisplay.from_predictions(
        y,
        y_pred=y_pred,
        kind="residual_vs_predicted",
        subsample=100,
        ax=axs[1],
        random_state=0,
    )
    axs[1].set_title("Residuals vs. Predicted Values")
    fig.suptitle("Plotting cross-validated predictions")
    plt.tight_layout()
    plt.show()

.. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_cv_predict_001.png
   :alt: Plotting cross-validated predictions, Actual vs. Predicted values, Residuals vs. Predicted Values
   :srcset: /auto_examples/model_selection/images/sphx_glr_plot_cv_predict_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 59-64

Note that in this example we used
:func:`~sklearn.model_selection.cross_val_predict` for visualization purposes
only.

When the CV folds differ in size and distribution, it is problematic to assess
model performance quantitatively by computing a single metric from the
concatenated predictions returned by
:func:`~sklearn.model_selection.cross_val_predict`.

Instead, it is recommended to compute per-fold performance metrics with
:func:`~sklearn.model_selection.cross_val_score` or
:func:`~sklearn.model_selection.cross_validate`.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.080 seconds)

.. _sphx_glr_download_auto_examples_model_selection_plot_cv_predict.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/model_selection/plot_cv_predict.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_cv_predict.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_cv_predict.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_cv_predict.zip `

.. include:: plot_cv_predict.recommendations

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
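
The recommendation in this example to compute per-fold metrics can be sketched as follows. This is a minimal illustration, not part of the original script: it assumes the same diabetes data and linear model, and the choice of R² as the metric as well as the names ``fold_scores`` and ``pooled_r2`` are assumptions made for the sketch. The pooled score is computed only to contrast it with the per-fold scores.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_predict, cross_val_score

X, y = load_diabetes(return_X_y=True)
lr = LinearRegression()

# Per-fold evaluation: one R^2 value per held-out fold (10 values for cv=10).
fold_scores = cross_val_score(lr, X, y, cv=10, scoring="r2")
print(f"per-fold R^2: mean={fold_scores.mean():.3f}, std={fold_scores.std():.3f}")

# Pooled evaluation (shown for contrast only): a single R^2 computed on the
# concatenated out-of-fold predictions returned by cross_val_predict.
pooled_r2 = r2_score(y, cross_val_predict(lr, X, y, cv=10))
print(f"pooled R^2: {pooled_r2:.3f}")
```

The per-fold scores also expose the spread across folds, which a single pooled number hides.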