.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/gaussian_process/plot_gpr_noisy_targets.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_gaussian_process_plot_gpr_noisy_targets.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_gaussian_process_plot_gpr_noisy_targets.py:

=========================================================
Gaussian process regression: basic introductory example
=========================================================

A simple one-dimensional regression example computed in two different ways:

1. A noise-free case
2. A noisy case with a known noise level for each data point

In both cases, the kernel's parameters are estimated by maximum likelihood.

The figures illustrate the interpolating property of the Gaussian process
model as well as its probabilistic nature, shown as a pointwise 95% confidence
interval.

Note that `alpha` is a parameter that controls the strength of the Tikhonov
regularization applied to the assumed covariance matrix of the training points.

.. GENERATED FROM PYTHON SOURCE LINES 17-21

.. code-block:: Python

    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause

.. GENERATED FROM PYTHON SOURCE LINES 22-26

Dataset generation
------------------

We start by generating a synthetic dataset. The true generative process is
defined as :math:`f(x) = x \sin(x)`.

.. GENERATED FROM PYTHON SOURCE LINES 26-31

.. code-block:: Python

    import numpy as np

    X = np.linspace(start=0, stop=10, num=1_000).reshape(-1, 1)
    y = np.squeeze(X * np.sin(X))

.. GENERATED FROM PYTHON SOURCE LINES 32-40

.. code-block:: Python

    import matplotlib.pyplot as plt

    plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
    plt.legend()
    plt.xlabel("$x$")
    plt.ylabel("$f(x)$")
    _ = plt.title("True generative process")

.. image-sg:: /auto_examples/gaussian_process/images/sphx_glr_plot_gpr_noisy_targets_001.png
   :alt: True generative process
   :srcset: /auto_examples/gaussian_process/images/sphx_glr_plot_gpr_noisy_targets_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 41-47

We will use this dataset in the next experiment to illustrate how Gaussian
process regression works.

Example with noise-free target
------------------------------

In this first example, we use the true generative process without adding any
noise. To train the Gaussian process regressor, we select only a few samples.

.. GENERATED FROM PYTHON SOURCE LINES 47-51

.. code-block:: Python

    rng = np.random.RandomState(1)
    training_indices = rng.choice(np.arange(y.size), size=6, replace=False)
    X_train, y_train = X[training_indices], y[training_indices]

.. GENERATED FROM PYTHON SOURCE LINES 52-53

Now, we fit a Gaussian process on these few training samples. We use a radial
basis function (RBF) kernel multiplied by a constant parameter that fits the
amplitude.

.. GENERATED FROM PYTHON SOURCE LINES 53-62

.. code-block:: Python

    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
    gaussian_process = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
    gaussian_process.fit(X_train, y_train)
    gaussian_process.kernel_

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    5.02**2 * RBF(length_scale=1.43)

.. GENERATED FROM PYTHON SOURCE LINES 63-64

After fitting the model, we see that the kernel's hyperparameters have been
optimized. We now use the fitted kernel to compute the mean prediction over
the full dataset and plot the 95% confidence interval.

.. GENERATED FROM PYTHON SOURCE LINES 64-82

.. code-block:: Python

    mean_prediction, std_prediction = gaussian_process.predict(X, return_std=True)

    plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
    plt.scatter(X_train, y_train, label="Observations")
    plt.plot(X, mean_prediction, label="Mean prediction")
    plt.fill_between(
        X.ravel(),
        mean_prediction - 1.96 * std_prediction,
        mean_prediction + 1.96 * std_prediction,
        alpha=0.5,
        label=r"95% confidence interval",
    )
    plt.legend()
    plt.xlabel("$x$")
    plt.ylabel("$f(x)$")
    _ = plt.title("Gaussian process regression on noise-free dataset")

.. image-sg:: /auto_examples/gaussian_process/images/sphx_glr_plot_gpr_noisy_targets_002.png
   :alt: Gaussian process regression on noise-free dataset
   :srcset: /auto_examples/gaussian_process/images/sphx_glr_plot_gpr_noisy_targets_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 83-91

For predictions made at points close to the training set, the 95% confidence
interval is narrow. When a sample lies far from the training data, the model's
prediction is less accurate and less precise (higher uncertainty).

Example with noisy targets
--------------------------

We can repeat a similar experiment, this time adding noise to the target. This
lets us see the effect of the noise on the fitted model.

We add random Gaussian noise with an arbitrary standard deviation to the
target.

.. GENERATED FROM PYTHON SOURCE LINES 91-94

.. code-block:: Python

    noise_std = 0.75
    y_train_noisy = y_train + rng.normal(loc=0.0, scale=noise_std, size=y_train.shape)

.. GENERATED FROM PYTHON SOURCE LINES 95-96

We create a similar Gaussian process model. In addition to the kernel, this
time we specify the parameter `alpha`, which can be interpreted as the variance
of the Gaussian noise.

.. GENERATED FROM PYTHON SOURCE LINES 96-103

.. code-block:: Python

    gaussian_process = GaussianProcessRegressor(
        kernel=kernel, alpha=noise_std**2, n_restarts_optimizer=9
    )
    gaussian_process.fit(X_train, y_train_noisy)
    mean_prediction, std_prediction = gaussian_process.predict(X, return_std=True)

.. GENERATED FROM PYTHON SOURCE LINES 104-105

Let's plot the mean prediction and the uncertainty region as before.

.. GENERATED FROM PYTHON SOURCE LINES 105-131

.. code-block:: Python

    plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
    plt.errorbar(
        X_train,
        y_train_noisy,
        noise_std,
        linestyle="None",
        color="tab:blue",
        marker=".",
        markersize=10,
        label="Observations",
    )
    plt.plot(X, mean_prediction, label="Mean prediction")
    plt.fill_between(
        X.ravel(),
        mean_prediction - 1.96 * std_prediction,
        mean_prediction + 1.96 * std_prediction,
        color="tab:orange",
        alpha=0.5,
        label=r"95% confidence interval",
    )
    plt.legend()
    plt.xlabel("$x$")
    plt.ylabel("$f(x)$")
    _ = plt.title("Gaussian process regression on a noisy dataset")

.. image-sg:: /auto_examples/gaussian_process/images/sphx_glr_plot_gpr_noisy_targets_003.png
   :alt: Gaussian process regression on a noisy dataset
   :srcset: /auto_examples/gaussian_process/images/sphx_glr_plot_gpr_noisy_targets_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 132-133

The noise affects the predictions close to the training samples: because we
explicitly model a given level of target noise that is independent of the
input variable, the predictive uncertainty near the training samples is larger
than in the noise-free case.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.242 seconds)

.. _sphx_glr_download_auto_examples_gaussian_process_plot_gpr_noisy_targets.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/gaussian_process/plot_gpr_noisy_targets.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_gpr_noisy_targets.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_gpr_noisy_targets.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_gpr_noisy_targets.zip `

.. include:: plot_gpr_noisy_targets.recommendations

.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
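
The effect of `alpha` described in the noisy-target example can also be
checked by hand. The following is a minimal NumPy sketch, not scikit-learn's
implementation. It assumes a zero-mean prior, fixes the RBF amplitude and
length scale near the fitted values reported above instead of optimizing them,
and uses hypothetical training inputs. The helper names ``rbf_kernel`` and
``gp_posterior`` are illustrative only.

.. code-block:: Python

    import numpy as np


    def rbf_kernel(a, b, amplitude=5.0, length_scale=1.4):
        """Scaled RBF kernel between two 1-D input arrays (assumed fixed hyperparameters)."""
        sq_dist = (a[:, None] - b[None, :]) ** 2
        return amplitude**2 * np.exp(-0.5 * sq_dist / length_scale**2)


    def gp_posterior(x_train, y_train, x_test, alpha):
        """Posterior mean and standard deviation of a zero-mean Gaussian process.

        ``alpha`` is added to the diagonal of the training covariance matrix,
        which is the Tikhonov regularization mentioned in the text.
        """
        K = rbf_kernel(x_train, x_train) + alpha * np.eye(x_train.size)
        L = np.linalg.cholesky(K)
        # Solve (K + alpha * I) w = y_train using the Cholesky factor.
        weights = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
        K_s = rbf_kernel(x_train, x_test)
        mean = K_s.T @ weights
        v = np.linalg.solve(L, K_s)
        var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
        # Clip tiny negative values caused by floating-point rounding.
        return mean, np.sqrt(np.clip(var, 0.0, None))


    # Hypothetical training inputs; targets follow f(x) = x * sin(x).
    x_train = np.array([1.0, 3.0, 5.0, 6.0, 7.0, 8.0])
    y_train = x_train * np.sin(x_train)
    noise_std = 0.75

    # Evaluate both models at the training inputs themselves.
    mean_noise_free, std_noise_free = gp_posterior(x_train, y_train, x_train, alpha=1e-10)
    mean_noisy, std_noisy = gp_posterior(x_train, y_train, x_train, alpha=noise_std**2)

    print("std at training inputs, alpha=1e-10:       ", std_noise_free)
    print("std at training inputs, alpha=noise_std**2:", std_noisy)

Under these assumptions, the near-zero `alpha` makes the posterior mean pass
through the training targets with almost no uncertainty, whereas adding
``noise_std**2`` to the diagonal leaves a strictly positive standard deviation
at the training inputs.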