.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/linear_model/plot_robust_fit.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_linear_model_plot_robust_fit.py:


Robust linear estimator fitting
===============================

Here a sine function is fit with a third-order polynomial for values close to
zero.

Robust fitting is demonstrated in different situations:

- No measurement errors, only modelling errors (fitting a sine with a
  polynomial)

- Measurement errors in X

- Measurement errors in y

The error on new, non-corrupted data is used to judge the quality of the
prediction; the script below computes it as the mean squared error.

What we can see:

- RANSAC is good for strong outliers in the y direction.

- TheilSen is good for small outliers, in both the X and y directions, but has
  a break point above which it performs worse than OLS.

- The scores of HuberRegressor may not be compared directly to those of
  TheilSen and RANSAC, because it does not attempt to completely filter out
  the outliers but rather lessens their effect.

.. GENERATED FROM PYTHON SOURCE LINES 26-111

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_001.png
         :alt: Modeling Errors Only
         :srcset: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_001.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_002.png
         :alt: Corrupt X, Small Deviants
         :srcset: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_002.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_003.png
         :alt: Corrupt y, Small Deviants
         :srcset: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_004.png
         :alt: Corrupt X, Large Deviants
         :srcset: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_004.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_005.png
         :alt: Corrupt y, Large Deviants
         :srcset: /auto_examples/linear_model/images/sphx_glr_plot_robust_fit_005.png
         :class: sphx-glr-multi-img


.. code-block:: Python


    import numpy as np
    from matplotlib import pyplot as plt

    from sklearn.linear_model import (
        HuberRegressor,
        LinearRegression,
        RANSACRegressor,
        TheilSenRegressor,
    )
    from sklearn.metrics import mean_squared_error
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    np.random.seed(42)

    # Training data: noise-free samples of a sine function
    X = np.random.normal(size=400)
    y = np.sin(X)
    # Make sure that X is 2D
    X = X[:, np.newaxis]

    # Uncorrupted test data used to score every fitted model
    X_test = np.random.normal(size=200)
    y_test = np.sin(X_test)
    X_test = X_test[:, np.newaxis]

    # Corrupt every third sample, in y or in X, with a small (3) or large (10) value
    y_errors = y.copy()
    y_errors[::3] = 3

    X_errors = X.copy()
    X_errors[::3] = 3

    y_errors_large = y.copy()
    y_errors_large[::3] = 10

    X_errors_large = X.copy()
    X_errors_large[::3] = 10

    estimators = [
        ("OLS", LinearRegression()),
        ("Theil-Sen", TheilSenRegressor(random_state=42)),
        ("RANSAC", RANSACRegressor(random_state=42)),
        ("HuberRegressor", HuberRegressor()),
    ]
    colors = {
        "OLS": "turquoise",
        "Theil-Sen": "gold",
        "RANSAC": "lightgreen",
        "HuberRegressor": "black",
    }
    linestyle = {"OLS": "-", "Theil-Sen": "-.", "RANSAC": "--", "HuberRegressor": "--"}
    lw = 3

    x_plot = np.linspace(X.min(), X.max())
    for title, this_X, this_y in [
        ("Modeling Errors Only", X, y),
        ("Corrupt X, Small Deviants", X_errors, y),
        ("Corrupt y, Small Deviants", X, y_errors),
        ("Corrupt X, Large Deviants", X_errors_large, y),
        ("Corrupt y, Large Deviants", X, y_errors_large),
    ]:
        plt.figure(figsize=(5, 4))
        plt.plot(this_X[:, 0], this_y, "b+")

        for name, estimator in estimators:
            # Degree-3 polynomial features followed by the (robust) linear estimator
            model = make_pipeline(PolynomialFeatures(3), estimator)
            model.fit(this_X, this_y)
            # Score on the uncorrupted test set (arguments are y_true, y_pred)
            mse = mean_squared_error(y_test, model.predict(X_test))
            y_plot = model.predict(x_plot[:, np.newaxis])
            plt.plot(
                x_plot,
                y_plot,
                color=colors[name],
                linestyle=linestyle[name],
                linewidth=lw,
                label="%s: error = %.3f" % (name, mse),
            )

        legend_title = "Error of Mean\nAbsolute Deviation\nto Non-corrupt Data"
        legend = plt.legend(
            loc="upper right", frameon=False, title=legend_title, prop=dict(size="x-small")
        )
        plt.xlim(-4, 10.2)
        plt.ylim(-2, 10.2)
        plt.title(title)
    plt.show()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.952 seconds)


.. include:: plot_robust_fit.recommendations
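

The observation that Theil-Sen tolerates a moderate share of outliers can be
made concrete with a minimal, standard-library-only sketch of the idea for a
single feature: take the median of all pairwise slopes, then the median of the
resulting intercepts. This is an illustration under simplifying assumptions,
not how ``TheilSenRegressor`` is implemented in scikit-learn; the helper names
``ols_line`` and ``theil_sen_line`` and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean, median


def ols_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def theil_sen_line(xs, ys):
    """Median of pairwise slopes, then median of the per-point intercepts."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip vertical pairs, whose slope is undefined
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Toy data on the line y = 2x + 1; every fifth target is replaced by 30.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
ys_corrupt = [30.0 if i % 5 == 0 else y for i, y in enumerate(ys)]

ols_slope, ols_intercept = ols_line(xs, ys_corrupt)
ts_slope, ts_intercept = theil_sen_line(xs, ys_corrupt)
print(f"OLS:       slope={ols_slope:.3f}, intercept={ols_intercept:.3f}")
print(f"Theil-Sen: slope={ts_slope:.3f}, intercept={ts_intercept:.3f}")
```

On this toy data, 120 of the 190 point pairs are uncorrupted, so the median
slope and intercept stay at 2 and 1, while the least-squares slope is pulled
down to roughly 1.46. Once corrupted pairs make up the majority, the median
no longer protects the fit, which is the break point mentioned above.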