.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/linear_model/plot_lasso_lars_ic.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_linear_model_plot_lasso_lars_ic.py>`
        to download the full example code or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_linear_model_plot_lasso_lars_ic.py:


==============================================
Lasso model selection via information criteria
==============================================

This example reproduces Fig. 2 of [ZHT2007]_. A
:class:`~sklearn.linear_model.LassoLarsIC` estimator is fit on the
diabetes dataset and the AIC and the BIC criteria are used to select
the best model.

.. note::
    It is important to note that the optimization to find `alpha` with
    :class:`~sklearn.linear_model.LassoLarsIC` relies on the AIC or BIC
    criterion, which is computed in-sample, i.e. directly on the training set.
    This approach differs from the cross-validation procedure. For a comparison
    of the two approaches, you can refer to the following example:
    :ref:`sphx_glr_auto_examples_linear_model_plot_lasso_model_selection.py`.
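
To make the distinction concrete, here is a minimal sketch (not part of the
original example) contrasting the in-sample, information-criterion based
selection of `alpha` with the cross-validated selection performed by
:class:`~sklearn.linear_model.LassoCV`:

.. code-block:: python

    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import LassoCV, LassoLarsIC

    X, y = load_diabetes(return_X_y=True)

    # AIC/BIC are evaluated in-sample, i.e. on the training data itself.
    ic_model = LassoLarsIC(criterion="aic").fit(X, y)

    # LassoCV instead selects alpha from out-of-sample (cross-validated) error.
    cv_model = LassoCV(cv=5).fit(X, y)

    print(f"alpha selected by AIC:       {ic_model.alpha_:.4f}")
    print(f"alpha selected by 5-fold CV: {cv_model.alpha_:.4f}")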

.. topic:: References

    .. [ZHT2007] :arxiv:`Zou, Hui, Trevor Hastie, and Robert Tibshirani.
       "On the degrees of freedom of the lasso."
       The Annals of Statistics 35.5 (2007): 2173-2192.
       <0712.0881>`

.. GENERATED FROM PYTHON SOURCE LINES 26-31

.. code-block:: default


    # Author: Alexandre Gramfort
    #         Guillaume Lemaitre
    # License: BSD 3 clause








.. GENERATED FROM PYTHON SOURCE LINES 32-33

We will use the diabetes dataset.

.. GENERATED FROM PYTHON SOURCE LINES 33-39

.. code-block:: default

    from sklearn.datasets import load_diabetes

    X, y = load_diabetes(return_X_y=True, as_frame=True)
    n_samples = X.shape[0]
    X.head()






.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>age</th>
          <th>sex</th>
          <th>bmi</th>
          <th>bp</th>
          <th>s1</th>
          <th>s2</th>
          <th>s3</th>
          <th>s4</th>
          <th>s5</th>
          <th>s6</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>0.038076</td>
          <td>0.050680</td>
          <td>0.061696</td>
          <td>0.021872</td>
          <td>-0.044223</td>
          <td>-0.034821</td>
          <td>-0.043401</td>
          <td>-0.002592</td>
          <td>0.019907</td>
          <td>-0.017646</td>
        </tr>
        <tr>
          <th>1</th>
          <td>-0.001882</td>
          <td>-0.044642</td>
          <td>-0.051474</td>
          <td>-0.026328</td>
          <td>-0.008449</td>
          <td>-0.019163</td>
          <td>0.074412</td>
          <td>-0.039493</td>
          <td>-0.068332</td>
          <td>-0.092204</td>
        </tr>
        <tr>
          <th>2</th>
          <td>0.085299</td>
          <td>0.050680</td>
          <td>0.044451</td>
          <td>-0.005670</td>
          <td>-0.045599</td>
          <td>-0.034194</td>
          <td>-0.032356</td>
          <td>-0.002592</td>
          <td>0.002861</td>
          <td>-0.025930</td>
        </tr>
        <tr>
          <th>3</th>
          <td>-0.089063</td>
          <td>-0.044642</td>
          <td>-0.011595</td>
          <td>-0.036656</td>
          <td>0.012191</td>
          <td>0.024991</td>
          <td>-0.036038</td>
          <td>0.034309</td>
          <td>0.022688</td>
          <td>-0.009362</td>
        </tr>
        <tr>
          <th>4</th>
          <td>0.005383</td>
          <td>-0.044642</td>
          <td>-0.036385</td>
          <td>0.021872</td>
          <td>0.003935</td>
          <td>0.015596</td>
          <td>0.008142</td>
          <td>-0.002592</td>
          <td>-0.031988</td>
          <td>-0.046641</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 40-48

Scikit-learn provides an estimator called
:class:`~sklearn.linear_model.LassoLarsIC` that uses either Akaike's
information criterion (AIC) or the Bayesian information criterion (BIC) to
select the best model. Before fitting this model, we will scale the dataset.

In the following, we are going to fit two models to compare the values
reported by AIC and BIC.

.. GENERATED FROM PYTHON SOURCE LINES 48-55

.. code-block:: default

    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LassoLarsIC
    from sklearn.pipeline import make_pipeline

    lasso_lars_ic = make_pipeline(StandardScaler(), LassoLarsIC(criterion="aic")).fit(X, y)
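
In the rest of the example, we will need to access the fitted
:class:`~sklearn.linear_model.LassoLarsIC` step of this pipeline. A fitted
pipeline can be indexed like a list, so `lasso_lars_ic[-1]` returns that
step; a small sketch (the step name ``"lassolarsic"`` is the one
:func:`~sklearn.pipeline.make_pipeline` derives from the class name):

.. code-block:: python

    # Indexing the pipeline returns its fitted steps; the last one is the
    # LassoLarsIC instance, the same object as
    # lasso_lars_ic.named_steps["lassolarsic"].
    fitted_ic = lasso_lars_ic[-1]
    print(f"alpha selected by AIC: {fitted_ic.alpha_:.4f}")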









.. GENERATED FROM PYTHON SOURCE LINES 56-61

To be in line with the definition in [ZHT2007]_, we need to rescale the
AIC and the BIC. Indeed, Zou et al. ignore some constant terms compared to
the original definition of the AIC derived from the maximum log-likelihood
of a linear model. You can refer to the
:ref:`mathematical details section of the User Guide <lasso_lars_ic>`.
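
As described in the User Guide section linked above, scikit-learn's AIC for
a Gaussian linear model is

.. math::

    \mathrm{AIC} = n \log\left(2 \pi \hat{\sigma}^2\right)
    + \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\hat{\sigma}^2}
    + 2 d,

where :math:`n` is the number of samples, :math:`\hat{\sigma}^2` the
estimated noise variance and :math:`d` the number of degrees of freedom of
the fit (the BIC replaces :math:`2 d` with :math:`d \log(n)`). Subtracting
the constant :math:`n \log(2 \pi \hat{\sigma}^2) + n`, as the helper below
does, recovers the convention of Zou et al.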

.. GENERATED FROM PYTHON SOURCE LINES 61-66

.. code-block:: default

    import numpy as np

    def zou_et_al_criterion_rescaling(criterion, n_samples, noise_variance):
        """Rescale the information criterion to follow the definition of Zou et al."""
        return criterion - n_samples * np.log(2 * np.pi * noise_variance) - n_samples









.. GENERATED FROM PYTHON SOURCE LINES 67-79

.. code-block:: default

    aic_criterion = zou_et_al_criterion_rescaling(
        lasso_lars_ic[-1].criterion_,
        n_samples,
        lasso_lars_ic[-1].noise_variance_,
    )

    index_alpha_path_aic = np.flatnonzero(
        lasso_lars_ic[-1].alphas_ == lasso_lars_ic[-1].alpha_
    )[0]








.. GENERATED FROM PYTHON SOURCE LINES 80-92

.. code-block:: default

    lasso_lars_ic.set_params(lassolarsic__criterion="bic").fit(X, y)

    bic_criterion = zou_et_al_criterion_rescaling(
        lasso_lars_ic[-1].criterion_,
        n_samples,
        lasso_lars_ic[-1].noise_variance_,
    )

    index_alpha_path_bic = np.flatnonzero(
        lasso_lars_ic[-1].alphas_ == lasso_lars_ic[-1].alpha_
    )[0]








.. GENERATED FROM PYTHON SOURCE LINES 93-96

Now that we have collected the AIC and BIC, we can also check that the
minima of both criteria occur at the same alpha, which will let us simplify
the plot below.

.. GENERATED FROM PYTHON SOURCE LINES 96-98

.. code-block:: default

    index_alpha_path_aic == index_alpha_path_bic





.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    True
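
As a side note: the rescaling above subtracts the same constant from every
point along the path, so it does not change the location of the minimum. A
minimal sanity check (a sketch, not part of the original example):

.. code-block:: python

    # The argmin of the rescaled criterion matches the index of the
    # selected alpha, since the rescaling is a constant shift.
    assert np.argmin(aic_criterion) == index_alpha_path_aic
    assert np.argmin(bic_criterion) == index_alpha_path_bic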



.. GENERATED FROM PYTHON SOURCE LINES 99-101

Finally, we can plot the AIC and BIC criteria and the corresponding
selected regularization parameter.

.. GENERATED FROM PYTHON SOURCE LINES 101-117

.. code-block:: default

    import matplotlib.pyplot as plt

    plt.plot(aic_criterion, color="tab:blue", marker="o", label="AIC criterion")
    plt.plot(bic_criterion, color="tab:orange", marker="o", label="BIC criterion")
    plt.vlines(
        index_alpha_path_bic,
        aic_criterion.min(),
        aic_criterion.max(),
        color="black",
        linestyle="--",
        label="Selected alpha",
    )
    plt.legend()
    plt.ylabel("Information criterion")
    plt.xlabel("Lasso model sequence")
    _ = plt.title("Lasso model selection via AIC and BIC")



.. image-sg:: /auto_examples/linear_model/images/sphx_glr_plot_lasso_lars_ic_001.png
   :alt: Lasso model selection via AIC and BIC
   :srcset: /auto_examples/linear_model/images/sphx_glr_plot_lasso_lars_ic_001.png
   :class: sphx-glr-single-img
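
A possible variant (a sketch, not part of the original example) is to plot
the criteria against the actual values of `alpha` along the path, on a
logarithmic scale, rather than against the step index:

.. code-block:: python

    # Both fits share the same LARS path, so alphas_ is identical; note that
    # the path ends at alpha = 0, which a log-scaled axis cannot display.
    alphas = lasso_lars_ic[-1].alphas_

    plt.figure()
    plt.semilogx(alphas, aic_criterion, marker="o", label="AIC criterion")
    plt.semilogx(alphas, bic_criterion, marker="o", label="BIC criterion")
    plt.axvline(
        lasso_lars_ic[-1].alpha_, color="black", linestyle="--", label="Selected alpha"
    )
    plt.xlabel("alpha (log scale)")
    plt.ylabel("Information criterion")
    _ = plt.legend()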






.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.130 seconds)


.. _sphx_glr_download_auto_examples_linear_model_plot_lasso_lars_ic.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/1.2.X?urlpath=lab/tree/notebooks/auto_examples/linear_model/plot_lasso_lars_ic.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_lasso_lars_ic.py <plot_lasso_lars_ic.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_lasso_lars_ic.ipynb <plot_lasso_lars_ic.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_