.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/classification/plot_digits_classification.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end
        <sphx_glr_download_auto_examples_classification_plot_digits_classification.py>`
        to download the full example code or to run this example in your
        browser via JupyterLite or Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_classification_plot_digits_classification.py:


================================
Recognizing hand-written digits
================================

This example shows how scikit-learn can be used to recognize images of
hand-written digits, from 0-9.

.. GENERATED FROM PYTHON SOURCE LINES 10-21

.. code-block:: Python

    # Author: Gael Varoquaux
    # License: BSD 3 clause

    # Standard scientific Python imports
    import matplotlib.pyplot as plt

    # Import datasets, classifiers and performance metrics
    from sklearn import datasets, metrics, svm
    from sklearn.model_selection import train_test_split

.. GENERATED FROM PYTHON SOURCE LINES 22-34

Digits dataset
--------------

The digits dataset consists of 8x8 pixel images of digits. The ``images``
attribute of the dataset stores 8x8 arrays of grayscale values for each
image. We will use these arrays to visualize the first 4 images. The
``target`` attribute of the dataset stores the digit each image represents,
and this is included in the title of the 4 plots below.

Note: if we were working from image files (e.g., 'png' files), we would load
them using :func:`matplotlib.pyplot.imread`.

.. GENERATED FROM PYTHON SOURCE LINES 34-43

.. code-block:: Python

    digits = datasets.load_digits()

    _, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
    for ax, image, label in zip(axes, digits.images, digits.target):
        ax.set_axis_off()
        ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
        ax.set_title("Training: %i" % label)

.. image-sg:: /auto_examples/classification/images/sphx_glr_plot_digits_classification_001.png
   :alt: Training: 0, Training: 1, Training: 2, Training: 3
   :srcset: /auto_examples/classification/images/sphx_glr_plot_digits_classification_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 44-57

Classification
--------------

To apply a classifier on this data, we need to flatten the images, turning
each 2-D array of grayscale values from shape ``(8, 8)`` into shape
``(64,)``. Subsequently, the entire dataset will be of shape
``(n_samples, n_features)``, where ``n_samples`` is the number of images and
``n_features`` is the total number of pixels in each image.

We can then split the data into train and test subsets and fit a support
vector classifier on the train samples. The fitted classifier can
subsequently be used to predict the value of the digit for the samples in
the test subset.

.. GENERATED FROM PYTHON SOURCE LINES 57-76

.. code-block:: Python

    # flatten the images
    n_samples = len(digits.images)
    data = digits.images.reshape((n_samples, -1))

    # Create a classifier: a support vector classifier
    clf = svm.SVC(gamma=0.001)

    # Split data into 50% train and 50% test subsets
    X_train, X_test, y_train, y_test = train_test_split(
        data, digits.target, test_size=0.5, shuffle=False
    )

    # Learn the digits on the train subset
    clf.fit(X_train, y_train)

    # Predict the value of the digit on the test subset
    predicted = clf.predict(X_test)
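The value ``gamma=0.001`` above is simply a setting that works well for this
dataset. As an editorial aside (not part of the original example), the same
hyperparameter could instead be chosen by cross-validated grid search; the
candidate values in the sketch below are illustrative only.

.. code-block:: Python

    # Editorial sketch: select ``gamma`` by cross-validation instead of
    # hard-coding it. The candidate grid is illustrative, not prescriptive.
    from sklearn.model_selection import GridSearchCV

    param_grid = {"gamma": [1e-4, 1e-3, 1e-2]}
    search = GridSearchCV(svm.SVC(), param_grid, cv=5)
    search.fit(X_train, y_train)
    print("Best parameters:", search.best_params_)
    print("Cross-validated accuracy:", search.best_score_)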
.. GENERATED FROM PYTHON SOURCE LINES 77-79

Below we visualize the first 4 test samples and show their predicted digit
value in the title.

.. GENERATED FROM PYTHON SOURCE LINES 79-87

.. code-block:: Python

    _, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
    for ax, image, prediction in zip(axes, X_test, predicted):
        ax.set_axis_off()
        image = image.reshape(8, 8)
        ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
        ax.set_title(f"Prediction: {prediction}")

.. image-sg:: /auto_examples/classification/images/sphx_glr_plot_digits_classification_002.png
   :alt: Prediction: 8, Prediction: 8, Prediction: 4, Prediction: 9
   :srcset: /auto_examples/classification/images/sphx_glr_plot_digits_classification_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 88-90

:func:`~sklearn.metrics.classification_report` builds a text report showing
the main classification metrics.

.. GENERATED FROM PYTHON SOURCE LINES 90-96

.. code-block:: Python

    print(
        f"Classification report for classifier {clf}:\n"
        f"{metrics.classification_report(y_test, predicted)}\n"
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Classification report for classifier SVC(gamma=0.001):
                  precision    recall  f1-score   support

               0       1.00      0.99      0.99        88
               1       0.99      0.97      0.98        91
               2       0.99      0.99      0.99        86
               3       0.98      0.87      0.92        91
               4       0.99      0.96      0.97        92
               5       0.95      0.97      0.96        91
               6       0.99      0.99      0.99        91
               7       0.96      0.99      0.97        89
               8       0.94      1.00      0.97        88
               9       0.93      0.98      0.95        92

        accuracy                           0.97       899
       macro avg       0.97      0.97      0.97       899
    weighted avg       0.97      0.97      0.97       899

.. GENERATED FROM PYTHON SOURCE LINES 97-99

We can also plot a :ref:`confusion matrix <confusion_matrix>` of the true
digit values and the predicted digit values.

.. GENERATED FROM PYTHON SOURCE LINES 99-106

.. code-block:: Python

    disp = metrics.ConfusionMatrixDisplay.from_predictions(y_test, predicted)
    disp.figure_.suptitle("Confusion Matrix")
    print(f"Confusion matrix:\n{disp.confusion_matrix}")

    plt.show()

.. image-sg:: /auto_examples/classification/images/sphx_glr_plot_digits_classification_003.png
   :alt: Confusion Matrix
   :srcset: /auto_examples/classification/images/sphx_glr_plot_digits_classification_003.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Confusion matrix:
    [[87  0  0  0  1  0  0  0  0  0]
     [ 0 88  1  0  0  0  0  0  1  1]
     [ 0  0 85  1  0  0  0  0  0  0]
     [ 0  0  0 79  0  3  0  4  5  0]
     [ 0  0  0  0 88  0  0  0  0  4]
     [ 0  0  0  0  0 88  1  0  0  2]
     [ 0  1  0  0  0  0 90  0  0  0]
     [ 0  0  0  0  0  1  0 88  0  0]
     [ 0  0  0  0  0  0  0  0 88  0]
     [ 0  0  0  1  0  1  0  0  0 90]]

.. GENERATED FROM PYTHON SOURCE LINES 107-111

If the results from evaluating a classifier are stored in the form of a
:ref:`confusion matrix <confusion_matrix>` and not in terms of `y_true` and
`y_pred`, one can still build a :func:`~sklearn.metrics.classification_report`
as follows:

.. GENERATED FROM PYTHON SOURCE LINES 111-129

.. code-block:: Python

    # The ground truth and predicted lists
    y_true = []
    y_pred = []
    cm = disp.confusion_matrix

    # For each cell in the confusion matrix, add the corresponding ground truths
    # and predictions to the lists
    for gt in range(len(cm)):
        for pred in range(len(cm)):
            y_true += [gt] * cm[gt][pred]
            y_pred += [pred] * cm[gt][pred]

    print(
        "Classification report rebuilt from confusion matrix:\n"
        f"{metrics.classification_report(y_true, y_pred)}\n"
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Classification report rebuilt from confusion matrix:
                  precision    recall  f1-score   support

               0       1.00      0.99      0.99        88
               1       0.99      0.97      0.98        91
               2       0.99      0.99      0.99        86
               3       0.98      0.87      0.92        91
               4       0.99      0.96      0.97        92
               5       0.95      0.97      0.96        91
               6       0.99      0.99      0.99        91
               7       0.96      0.99      0.97        89
               8       0.94      1.00      0.97        88
               9       0.93      0.98      0.95        92

        accuracy                           0.97       899
       macro avg       0.97      0.97      0.97       899
    weighted avg       0.97      0.97      0.97       899
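The nested loop above expands each cell of the confusion matrix back into
individual label pairs. The same expansion can be written in vectorized NumPy
form; the sketch below is an editorial addition, reusing the ``cm`` array
defined above.

.. code-block:: Python

    # Editorial sketch: a vectorized equivalent of the loop above.
    import numpy as np

    gt_idx, pred_idx = np.indices(cm.shape)           # row/column index of each cell
    counts = cm.ravel()                               # number of samples per cell
    y_true_vec = np.repeat(gt_idx.ravel(), counts)    # ground-truth label per sample
    y_pred_vec = np.repeat(pred_idx.ravel(), counts)  # predicted label per sample

    # This reproduces the same report as the loop-based reconstruction.
    print(metrics.classification_report(y_true_vec, y_pred_vec))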
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.472 seconds)


.. _sphx_glr_download_auto_examples_classification_plot_digits_classification.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/classification/plot_digits_classification.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/?path=auto_examples/classification/plot_digits_classification.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_digits_classification.ipynb <plot_digits_classification.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_digits_classification.py <plot_digits_classification.py>`

.. include:: plot_digits_classification.recommendations

.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_