.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/decomposition/plot_ica_vs_pca.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_decomposition_plot_ica_vs_pca.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_decomposition_plot_ica_vs_pca.py:


==========================
FastICA on 2D point clouds
==========================

This example visually compares, in the feature space, the results obtained
with two different component analysis techniques:

:ref:`ICA` vs :ref:`PCA`.

Representing ICA in the feature space gives the view of 'geometric ICA':
ICA is an algorithm that finds directions in the feature space
corresponding to projections with high non-Gaussianity. These directions
need not be orthogonal in the original feature space, but they are
orthogonal in the whitened feature space, in which all directions
correspond to the same variance.
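
As a quick, illustrative check of this geometric picture (a minimal sketch,
not part of the example itself): if the data are first whitened with PCA and
``FastICA`` is then fitted with ``whiten=False``, the unmixing matrix it
returns is orthogonal up to numerical error, i.e. a pure rotation of the
whitened space.

.. code-block:: default

    import numpy as np

    from sklearn.decomposition import PCA, FastICA

    rng = np.random.RandomState(0)
    S = rng.standard_t(1.5, size=(20000, 2))
    X = np.dot(S, np.array([[1, 1], [0, 2]]).T)

    # Whiten with PCA: every direction now carries the same (unit) variance.
    X_white = PCA(whiten=True).fit_transform(X)

    # Fit ICA on the already-whitened data; the estimated unmixing matrix
    # is then approximately orthogonal.
    W = FastICA(whiten=False, random_state=0).fit(X_white).components_
    print(np.round(W @ W.T, 3))  # approximately the identity matrix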

PCA, on the other hand, finds orthogonal directions in the raw feature
space that account for the maximum amount of variance.
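
The corresponding check for PCA (again only a sketch for illustration): the
principal directions of a fitted ``PCA`` are orthonormal in the raw feature
space and are sorted by the amount of variance they explain.

.. code-block:: default

    import numpy as np

    from sklearn.decomposition import PCA

    rng = np.random.RandomState(0)
    X = np.dot(rng.standard_t(1.5, size=(20000, 2)), np.array([[1, 1], [0, 2]]).T)

    pca = PCA().fit(X)
    print(np.round(pca.components_ @ pca.components_.T, 3))  # ~ identity matrix
    print(pca.explained_variance_)  # variances in decreasing order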

Here we simulate independent sources from a highly non-Gaussian process:
two Student's t distributions with a low number of degrees of freedom
(top-left panel). We mix them to create observations (top-right panel).
In this raw observation space, the directions identified by PCA are shown
as orange vectors. We then represent the signal in the PCA space, after
whitening by the variance corresponding to the PCA vectors (lower left).
Running ICA amounts to finding a rotation in this space that identifies
the directions of largest non-Gaussianity (lower right).

.. GENERATED FROM PYTHON SOURCE LINES 31-35

.. code-block:: default


    # Authors: Alexandre Gramfort, Gael Varoquaux
    # License: BSD 3 clause








.. GENERATED FROM PYTHON SOURCE LINES 36-38

Generate sample data
--------------------

.. GENERATED FROM PYTHON SOURCE LINES 38-60

.. code-block:: default

    import numpy as np

    from sklearn.decomposition import PCA, FastICA

    rng = np.random.RandomState(42)
    S = rng.standard_t(1.5, size=(20000, 2))
    S[:, 0] *= 2.0

    # Mix data
    A = np.array([[1, 1], [0, 2]])  # Mixing matrix

    X = np.dot(S, A.T)  # Generate observations

    pca = PCA()
    S_pca_ = pca.fit(X).transform(X)

    # whiten="arbitrary-variance" whitens the data but leaves the recovered
    # sources with an arbitrary scale; they are rescaled below for plotting.
    ica = FastICA(random_state=rng, whiten="arbitrary-variance")
    S_ica_ = ica.fit(X).transform(X)  # Estimate the sources

    S_ica_ /= S_ica_.std(axis=0)
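
ICA recovers the sources only up to permutation, sign and scale. A quick
sanity check (not part of the original example) is to compare the columns of
the estimated mixing matrix ``ica.mixing_`` with those of the true matrix
``A`` after normalising them to unit length:

.. code-block:: default

    # The normalised columns should match up to sign and ordering.
    print(A / np.linalg.norm(A, axis=0))
    print(ica.mixing_ / np.linalg.norm(ica.mixing_, axis=0))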









.. GENERATED FROM PYTHON SOURCE LINES 61-63

Plot results
------------

.. GENERATED FROM PYTHON SOURCE LINES 63-117

.. code-block:: default

    import matplotlib.pyplot as plt


    def plot_samples(S, axis_list=None):
        plt.scatter(
            S[:, 0], S[:, 1], s=2, marker="o", zorder=10, color="steelblue", alpha=0.5
        )
        if axis_list is not None:
            for axis, color, label in axis_list:
                # Rescale for display without modifying the input arrays in place
                axis = axis / axis.std()
                x_axis, y_axis = axis
                plt.quiver(
                    (0, 0),
                    (0, 0),
                    x_axis,
                    y_axis,
                    zorder=11,
                    width=0.01,
                    scale=6,
                    color=color,
                    label=label,
                )

        plt.hlines(0, -3, 3)
        plt.vlines(0, -3, 3)
        plt.xlim(-3, 3)
        plt.ylim(-3, 3)
        plt.xlabel("x")
        plt.ylabel("y")


    plt.figure()
    plt.subplot(2, 2, 1)
    plot_samples(S / S.std())
    plt.title("True Independent Sources")

    axis_list = [(pca.components_.T, "orange", "PCA"), (ica.mixing_, "red", "ICA")]
    plt.subplot(2, 2, 2)
    plot_samples(X / np.std(X), axis_list=axis_list)
    legend = plt.legend(loc="lower right")
    legend.set_zorder(100)

    plt.title("Observations")

    plt.subplot(2, 2, 3)
    plot_samples(S_pca_ / np.std(S_pca_, axis=0))
    plt.title("PCA recovered signals")

    plt.subplot(2, 2, 4)
    plot_samples(S_ica_ / np.std(S_ica_))
    plt.title("ICA recovered signals")

    plt.subplots_adjust(0.09, 0.04, 0.94, 0.94, 0.26, 0.36)
    plt.show()



.. image-sg:: /auto_examples/decomposition/images/sphx_glr_plot_ica_vs_pca_001.png
   :alt: True Independent Sources, Observations, PCA recovered signals, ICA recovered signals
   :srcset: /auto_examples/decomposition/images/sphx_glr_plot_ica_vs_pca_001.png
   :class: sphx-glr-single-img






.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.273 seconds)


.. _sphx_glr_download_auto_examples_decomposition_plot_ica_vs_pca.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/1.3.X?urlpath=lab/tree/notebooks/auto_examples/decomposition/plot_ica_vs_pca.ipynb
        :alt: Launch binder
        :width: 150 px



    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/?path=auto_examples/decomposition/plot_ica_vs_pca.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_ica_vs_pca.py <plot_ica_vs_pca.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_ica_vs_pca.ipynb <plot_ica_vs_pca.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_