.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/linear_model/plot_huber_vs_ridge.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code or to run this example in your browser via JupyterLite or Binder .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_linear_model_plot_huber_vs_ridge.py: ======================================================= HuberRegressor vs Ridge on dataset with strong outliers ======================================================= Fit Ridge and HuberRegressor on a dataset with outliers. The example shows that the predictions in ridge are strongly influenced by the outliers present in the dataset. The Huber regressor is less influenced by the outliers since the model uses the linear loss for these. As the parameter epsilon is increased for the Huber regressor, the decision function approaches that of the ridge. .. GENERATED FROM PYTHON SOURCE LINES 15-65 .. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_huber_vs_ridge_001.png :alt: Comparison of HuberRegressor vs Ridge :srcset: /auto_examples/linear_model/images/sphx_glr_plot_huber_vs_ridge_001.png :class: sphx-glr-single-img .. code-block:: Python # Authors: Manoj Kumar mks542@nyu.edu # License: BSD 3 clause import matplotlib.pyplot as plt import numpy as np from sklearn.datasets import make_regression from sklearn.linear_model import HuberRegressor, Ridge # Generate toy data. rng = np.random.RandomState(0) X, y = make_regression( n_samples=20, n_features=1, random_state=0, noise=4.0, bias=100.0 ) # Add four strong outliers to the dataset. X_outliers = rng.normal(0, 0.5, size=(4, 1)) y_outliers = rng.normal(0, 2.0, size=4) X_outliers[:2, :] += X.max() + X.mean() / 4.0 X_outliers[2:, :] += X.min() - X.mean() / 4.0 y_outliers[:2] += y.min() - y.mean() / 4.0 y_outliers[2:] += y.max() + y.mean() / 4.0 X = np.vstack((X, X_outliers)) y = np.concatenate((y, y_outliers)) plt.plot(X, y, "b.") # Fit the huber regressor over a series of epsilon values. colors = ["r-", "b-", "y-", "m-"] x = np.linspace(X.min(), X.max(), 7) epsilon_values = [1, 1.5, 1.75, 1.9] for k, epsilon in enumerate(epsilon_values): huber = HuberRegressor(alpha=0.0, epsilon=epsilon) huber.fit(X, y) coef_ = huber.coef_ * x + huber.intercept_ plt.plot(x, coef_, colors[k], label="huber loss, %s" % epsilon) # Fit a ridge regressor to compare it to huber regressor. ridge = Ridge(alpha=0.0, random_state=0) ridge.fit(X, y) coef_ridge = ridge.coef_ coef_ = ridge.coef_ * x + ridge.intercept_ plt.plot(x, coef_, "g-", label="ridge regression") plt.title("Comparison of HuberRegressor vs Ridge") plt.xlabel("X") plt.ylabel("y") plt.legend(loc=0) plt.show() .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 0.098 seconds) .. _sphx_glr_download_auto_examples_linear_model_plot_huber_vs_ridge.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: binder-badge .. image:: images/binder_badge_logo.svg :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/linear_model/plot_huber_vs_ridge.ipynb :alt: Launch binder :width: 150 px .. container:: lite-badge .. image:: images/jupyterlite_badge_logo.svg :target: ../../lite/lab/?path=auto_examples/linear_model/plot_huber_vs_ridge.ipynb :alt: Launch JupyterLite :width: 150 px .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_huber_vs_ridge.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_huber_vs_ridge.py ` .. include:: plot_huber_vs_ridge.recommendations .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_