.. _example_calibration_plot_calibration.py:

======================================
Probability calibration of classifiers
======================================

When performing classification you often want to predict not only the class
label, but also the associated probability. This probability gives you some
kind of confidence in the prediction. However, not all classifiers provide
well-calibrated probabilities: some are over-confident while others are
under-confident. Thus, a separate calibration of predicted probabilities is
often desirable as a postprocessing step. This example illustrates two
different methods for this calibration and evaluates the quality of the
returned probabilities using the Brier score
(see http://en.wikipedia.org/wiki/Brier_score).

Compared are the probabilities estimated by a Gaussian naive Bayes classifier
without calibration, with a sigmoid calibration, and with a non-parametric
isotonic calibration. One can observe that only the non-parametric model
provides a probability calibration that returns probabilities close to the
expected 0.5 for most of the samples belonging to the middle cluster with
heterogeneous labels. This results in a significantly improved Brier score.

.. rst-class:: horizontal

    *

      .. image:: images/plot_calibration_001.png
            :scale: 47

    *

      .. image:: images/plot_calibration_002.png
            :scale: 47

**Script output**::

  Brier scores: (the smaller the better)
  No calibration: 0.104
  With isotonic calibration: 0.085
  With sigmoid calibration: 0.109

**Python source code:** :download:`plot_calibration.py`

.. literalinclude:: plot_calibration.py
    :lines: 23-

**Total running time of the example:** 0.43 seconds (0 minutes 0.43 seconds)
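
As a rough sketch (not the example's exact script), the comparison above can
be reproduced with :class:`sklearn.calibration.CalibratedClassifierCV` and
:func:`sklearn.metrics.brier_score_loss`; the data-generation details below
(sample count, cluster centers, train/test split) are illustrative
assumptions rather than the original settings, so the printed scores may
differ from those shown above::

    import numpy as np

    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_blobs
    from sklearn.metrics import brier_score_loss
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    # Assumed setup: three blobs generated in order (shuffle=False).
    # Relabelling the first half as 0 and the second half as 1 makes the
    # middle cluster contain a mix of both classes, so a well-calibrated
    # probability there should be close to 0.5.
    n_samples = 50000
    X, y = make_blobs(n_samples=n_samples,
                      centers=[(-5, -5), (0, 0), (5, 5)],
                      cluster_std=1.0, shuffle=False, random_state=42)
    y = np.zeros(n_samples)
    y[n_samples // 2:] = 1

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.9, random_state=42)

    # Gaussian naive Bayes without calibration
    clf = GaussianNB().fit(X_train, y_train)
    prob_pos = clf.predict_proba(X_test)[:, 1]

    # The same classifier wrapped in CalibratedClassifierCV, which fits
    # the base estimator and the calibrator on internal cross-validation
    # folds (here cv=2).
    clf_isotonic = CalibratedClassifierCV(GaussianNB(), cv=2,
                                          method='isotonic')
    clf_isotonic.fit(X_train, y_train)
    prob_pos_isotonic = clf_isotonic.predict_proba(X_test)[:, 1]

    clf_sigmoid = CalibratedClassifierCV(GaussianNB(), cv=2,
                                         method='sigmoid')
    clf_sigmoid.fit(X_train, y_train)
    prob_pos_sigmoid = clf_sigmoid.predict_proba(X_test)[:, 1]

    # Brier score: mean squared difference between the predicted
    # probability and the actual 0/1 outcome (smaller is better).
    print("Brier scores: (the smaller the better)")
    print("No calibration: %.3f" % brier_score_loss(y_test, prob_pos))
    print("With isotonic calibration: %.3f"
          % brier_score_loss(y_test, prob_pos_isotonic))
    print("With sigmoid calibration: %.3f"
          % brier_score_loss(y_test, prob_pos_sigmoid))

Isotonic calibration is non-parametric: it only assumes a monotonic mapping
from classifier scores to probabilities, so it can correct distortions that
the two-parameter sigmoid fit cannot, at the cost of needing more
calibration data.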