.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/linear_model/plot_ransac.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_linear_model_plot_ransac.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_linear_model_plot_ransac.py:


===========================================
Robust linear model estimation using RANSAC
===========================================

In this example, we see how to robustly fit a linear model to faulty data using
the :ref:`RANSAC ` algorithm.

The ordinary linear regressor is sensitive to outliers, and the fitted line can
easily be skewed away from the true underlying relationship in the data.

The RANSAC regressor automatically splits the data into inliers and outliers,
and the fitted line is determined only by the identified inliers.

.. GENERATED FROM PYTHON SOURCE LINES 17-82


.. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_ransac_001.png
   :alt: plot ransac
   :srcset: /auto_examples/linear_model/images/sphx_glr_plot_ransac_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Estimated coefficients (true, linear regression, RANSAC):
    82.1903908407869 [54.17236387] [82.08533159]


|

.. code-block:: Python


    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause

    import numpy as np
    from matplotlib import pyplot as plt

    from sklearn import datasets, linear_model

    n_samples = 1000
    n_outliers = 50


    X, y, coef = datasets.make_regression(
        n_samples=n_samples,
        n_features=1,
        n_informative=1,
        noise=10,
        coef=True,
        random_state=0,
    )

    # Add outlier data
    np.random.seed(0)
    X[:n_outliers] = 3 + 0.5 * np.random.normal(size=(n_outliers, 1))
    y[:n_outliers] = -3 + 10 * np.random.normal(size=n_outliers)

    # Fit line using all data
    lr = linear_model.LinearRegression()
    lr.fit(X, y)

    # Robustly fit linear model with RANSAC algorithm
    ransac = linear_model.RANSACRegressor()
    ransac.fit(X, y)
    inlier_mask = ransac.inlier_mask_
    outlier_mask = np.logical_not(inlier_mask)

    # Predict data of estimated models
    line_X = np.arange(X.min(), X.max())[:, np.newaxis]
    line_y = lr.predict(line_X)
    line_y_ransac = ransac.predict(line_X)

    # Compare estimated coefficients
    print("Estimated coefficients (true, linear regression, RANSAC):")
    print(coef, lr.coef_, ransac.estimator_.coef_)

    lw = 2
    plt.scatter(
        X[inlier_mask], y[inlier_mask], color="yellowgreen", marker=".", label="Inliers"
    )
    plt.scatter(
        X[outlier_mask], y[outlier_mask], color="gold", marker=".", label="Outliers"
    )
    plt.plot(line_X, line_y, color="navy", linewidth=lw, label="Linear regressor")
    plt.plot(
        line_X,
        line_y_ransac,
        color="cornflowerblue",
        linewidth=lw,
        label="RANSAC regressor",
    )
    plt.legend(loc="lower right")
    plt.xlabel("Input")
    plt.ylabel("Response")
    plt.show()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.089 seconds)


.. _sphx_glr_download_auto_examples_linear_model_plot_ransac.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/1.6.X?urlpath=lab/tree/notebooks/auto_examples/linear_model/plot_ransac.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/linear_model/plot_ransac.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_ransac.ipynb <plot_ransac.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_ransac.py <plot_ransac.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_ransac.zip <plot_ransac.zip>`


.. include:: plot_ransac.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
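
The example states that the RANSAC regressor splits the data into inliers and
outliers and fits the line on the inliers only. The sketch below spells out one
way such a split can work, using only the Python standard library so that every
step is visible. It is a toy illustration under stated assumptions, not
scikit-learn's implementation: the helpers ``fit_line`` and ``toy_ransac``, the
fixed ``threshold`` of 2.0, the two-point candidate fit, and the synthetic data
are all hypothetical choices made for this sketch.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def toy_ransac(xs, ys, threshold, n_trials=200, seed=0):
    """Toy RANSAC for a 1-D line (hypothetical helper, for illustration)."""
    rng = random.Random(seed)
    best_mask = [False] * len(xs)
    for _ in range(n_trials):
        # 1. Build a candidate line through two randomly chosen samples.
        i, j = rng.sample(range(len(xs)), 2)
        if xs[i] == xs[j]:
            continue  # a vertical candidate cannot be written as y = a*x + b
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        # 2. Samples whose absolute residual is within the threshold are inliers.
        mask = [
            abs(y - (slope * x + intercept)) <= threshold for x, y in zip(xs, ys)
        ]
        # 3. Keep the candidate that gathers the largest consensus set.
        if sum(mask) > sum(best_mask):
            best_mask = mask
    # 4. Refit with least squares using the consensus set only.
    in_x = [x for x, keep in zip(xs, best_mask) if keep]
    in_y = [y for y, keep in zip(ys, best_mask) if keep]
    slope, intercept = fit_line(in_x, in_y)
    return slope, intercept, best_mask


# Synthetic data (assumed for this sketch): a noisy line with slope 5 ...
data_rng = random.Random(0)
n_samples, n_outliers = 200, 20
true_slope = 5.0
xs = [data_rng.uniform(-3, 3) for _ in range(n_samples)]
ys = [true_slope * x + 1.0 + data_rng.gauss(0, 0.5) for x in xs]
# ... whose first samples are replaced by a cluster of outliers below the line.
for k in range(n_outliers):
    xs[k] = 3 + 0.5 * data_rng.gauss(0, 1)
    ys[k] = -3 + data_rng.gauss(0, 1)

ols_slope, _ = fit_line(xs, ys)
ransac_slope, _, inlier_mask = toy_ransac(xs, ys, threshold=2.0)
print("Slopes (true, least squares, toy RANSAC):")
print(true_slope, ols_slope, ransac_slope)
```

As in the example above, the plain least-squares slope is pulled toward the
outlier cluster, while the slope refit on the consensus set stays close to the
true value. The threshold plays the role of a residual cutoff: a larger value
admits more samples into the consensus set, a smaller one makes the split
stricter.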