.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/cluster/plot_digits_linkage.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_cluster_plot_digits_linkage.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_cluster_plot_digits_linkage.py:


=============================================================================
Various Agglomerative Clustering on a 2D embedding of digits
=============================================================================

An illustration of the linkage options available for agglomerative clustering
on a 2D embedding of the digits dataset.

The goal of this example is to show intuitively how the linkage strategies
behave, not to find good clusters for the digits. This is why the example
works on a 2D embedding.

This example shows the "rich get richer" behavior of agglomerative
clustering, which tends to create uneven cluster sizes.

This behavior is pronounced for the average linkage strategy,
which ends up with a couple of clusters containing few data points.

The case of single linkage is even more pathological: one very
large cluster covers most digits, an intermediate-sized (clean)
cluster contains most of the zero digits, and all other clusters
are drawn from noise points around the fringes.
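
The uneven cluster sizes described above can also be checked numerically
rather than read off the plots. The following is a minimal sketch, not part
of the original script: the ``cluster_sizes`` dictionary and the fixed
``random_state=0`` are assumptions added here for illustration and
reproducibility. It counts how many points each linkage assigns to each of
the 10 clusters.

.. code-block:: Python

    import numpy as np

    from sklearn import datasets, manifold
    from sklearn.cluster import AgglomerativeClustering

    X, _ = datasets.load_digits(return_X_y=True)
    # Same kind of 2D embedding as in the example; random_state is an
    # assumption added here so that repeated runs give the same embedding.
    X_red = manifold.SpectralEmbedding(n_components=2, random_state=0).fit_transform(X)

    cluster_sizes = {}  # hypothetical helper: linkage name -> sorted sizes
    for linkage in ("ward", "average", "complete", "single"):
        labels = AgglomerativeClustering(linkage=linkage, n_clusters=10).fit_predict(X_red)
        # Number of points per cluster, largest cluster first.
        cluster_sizes[linkage] = np.sort(np.bincount(labels, minlength=10))[::-1]
        print(f"{linkage:>8}: {cluster_sizes[linkage].tolist()}")

A linkage that produces balanced clusters yields ten counts of similar
magnitude, whereas a "rich get richer" linkage yields one or two large counts
followed by a tail of very small ones.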

The other linkage strategies lead to more evenly distributed
clusters, which are therefore likely to be less sensitive to a
random resampling of the dataset.
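
One possible way to probe sensitivity to resampling is sketched below. It is
an assumption of this sketch, not something the original script does: each
linkage is fitted on the full embedding and on random 80% subsamples, and the
two labelings are compared on the shared points with the adjusted Rand index.
The helper ``resampling_agreement``, the number of rounds, and the subsample
fraction are hypothetical choices. The embedding is held fixed across
subsamples, which keeps the sketch short but makes it a weaker test than
recomputing the embedding each time.

.. code-block:: Python

    import numpy as np

    from sklearn import datasets, manifold
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.metrics import adjusted_rand_score

    X, _ = datasets.load_digits(return_X_y=True)
    X_red = manifold.SpectralEmbedding(n_components=2, random_state=0).fit_transform(X)
    rng = np.random.RandomState(0)


    def resampling_agreement(linkage, n_rounds=5, fraction=0.8):
        """Mean adjusted Rand index between full-data and subsample labelings."""
        full = AgglomerativeClustering(linkage=linkage, n_clusters=10).fit_predict(X_red)
        n_sub = int(fraction * len(X_red))
        scores = []
        for _ in range(n_rounds):
            # Subsample without replacement, then recluster only those points.
            idx = rng.choice(len(X_red), size=n_sub, replace=False)
            sub = AgglomerativeClustering(linkage=linkage, n_clusters=10).fit_predict(X_red[idx])
            # Compare both labelings on the points they have in common.
            scores.append(adjusted_rand_score(full[idx], sub))
        return float(np.mean(scores))


    agreement = {
        linkage: resampling_agreement(linkage)
        for linkage in ("ward", "average", "complete", "single")
    }
    for linkage, score in agreement.items():
        print(f"{linkage:>8}: mean ARI = {score:.2f}")

A score close to 1 means the subsample clustering largely reproduces the
full-data clustering. The adjusted Rand index is dominated by the large
clusters, so it should be read together with the cluster sizes rather than
on its own.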

.. GENERATED FROM PYTHON SOURCE LINES 29-89



.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_001.png
         :alt: ward linkage
         :srcset: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_001.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_002.png
         :alt: average linkage
         :srcset: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_002.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_003.png
         :alt: complete linkage
         :srcset: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_004.png
         :alt: single linkage
         :srcset: /auto_examples/cluster/images/sphx_glr_plot_digits_linkage_004.png
         :class: sphx-glr-multi-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Computing embedding
    Done.
    ward :  0.06s
    average :       0.06s
    complete :      0.05s
    single :        0.02s






|

.. code-block:: Python


    # Authors: Gael Varoquaux
    # License: BSD 3 clause (C) INRIA 2014

    from time import time

    import numpy as np
    from matplotlib import pyplot as plt

    from sklearn import datasets, manifold

    digits = datasets.load_digits()
    X, y = digits.data, digits.target
    n_samples, n_features = X.shape

    np.random.seed(0)


    # ----------------------------------------------------------------------
    # Visualize the clustering
    def plot_clustering(X_red, labels, title=None):
        x_min, x_max = np.min(X_red, axis=0), np.max(X_red, axis=0)
        X_red = (X_red - x_min) / (x_max - x_min)

        plt.figure(figsize=(6, 4))
        for digit in digits.target_names:
            plt.scatter(
                *X_red[y == digit].T,
                marker=f"${digit}$",
                s=50,
                c=plt.cm.nipy_spectral(labels[y == digit] / 10),
                alpha=0.5,
            )

        plt.xticks([])
        plt.yticks([])
        if title is not None:
            plt.title(title, size=17)
        plt.axis("off")
        plt.tight_layout(rect=[0, 0.03, 1, 0.95])


    # ----------------------------------------------------------------------
    # 2D embedding of the digits dataset
    print("Computing embedding")
    X_red = manifold.SpectralEmbedding(n_components=2).fit_transform(X)
    print("Done.")

    from sklearn.cluster import AgglomerativeClustering

    for linkage in ("ward", "average", "complete", "single"):
        clustering = AgglomerativeClustering(linkage=linkage, n_clusters=10)
        t0 = time()
        clustering.fit(X_red)
        print("%s :\t%.2fs" % (linkage, time() - t0))

        plot_clustering(X_red, clustering.labels_, "%s linkage" % linkage)


    plt.show()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.559 seconds)


.. _sphx_glr_download_auto_examples_cluster_plot_digits_linkage.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/1.4.X?urlpath=lab/tree/notebooks/auto_examples/cluster/plot_digits_linkage.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/?path=auto_examples/cluster/plot_digits_linkage.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_digits_linkage.ipynb <plot_digits_linkage.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_digits_linkage.py <plot_digits_linkage.py>`


.. include:: plot_digits_linkage.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_