brier_score_loss#

sklearn.metrics.brier_score_loss(y_true, y_proba, *, sample_weight=None, pos_label=None, labels=None, scale_by_half='auto')[source]#

Compute the Brier score loss.

The smaller the Brier score loss, the better, hence the naming with “loss”. The Brier score measures the mean squared difference between the predicted probability and the actual outcome. The Brier score is a strictly proper scoring rule.

Read more in the User Guide.

Parameters:

y_truearray-like of shape (n_samples,)

True targets.

y_probaarray-like of shape (n_samples,) or (n_samples, n_classes)

Predicted probabilities. If y_proba.shape = (n_samples,) the probabilities provided are assumed to be that of the positive class. If y_proba.shape = (n_samples, n_classes) the columns in y_proba are assumed to correspond to the labels in alphabetical order, as done by LabelBinarizer.

sample_weightarray-like of shape (n_samples,), default=None

Sample weights.

pos_labelint, float, bool or str, default=None

Label of the positive class when y_proba.shape = (n_samples,). If not provided, pos_label will be inferred in the following manner:

if y_true in {-1, 1} or {0, 1}, pos_label defaults to 1;
else if y_true contains string, an error will be raised and pos_label should be explicitly specified;
otherwise, pos_label defaults to the greater label, i.e. np.unique(y_true)[-1].

labelsarray-like of shape (n_classes,), default=None

Class labels when y_proba.shape = (n_samples, n_classes). If not provided, labels will be inferred from y_true.

Added in version 1.7.

scale_by_halfbool or “auto”, default=”auto”

When True, scale the Brier score by 1/2 to lie in the [0, 1] range instead of the [0, 2] range. The default “auto” option implements the rescaling to [0, 1] only for binary classification (as customary) but keeps the original [0, 2] range for multiclass classification.

Added in version 1.7.

Returns:

scorefloat: Brier score loss.

Notes

For \(N\) observations labeled from \(C\) possible classes, the Brier score is defined as:

\[\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}(y_{ic} - \hat{p}_{ic})^{2}\]

where \(y_{ic}\) is 1 if observation i belongs to class c, otherwise 0 and \(\hat{p}_{ic}\) is the predicted probability for observation i to belong to class c. The Brier score then ranges between \([0, 2]\).

In binary classification tasks the Brier score is usually divided by two and then ranges between \([0, 1]\). It can be alternatively written as:

\[\frac{1}{N}\sum_{i=1}^{N}(y_{i} - \hat{p}_{i})^{2}\]

where \(y_{i}\) is the binary target and \(\hat{p}_{i}\) is the predicted probability of the positive class.

References

[1]

Wikipedia entry for the Brier score.

Examples

>>> import numpy as np
>>> from sklearn.metrics import brier_score_loss
>>> y_true = np.array([0, 1, 1, 0])
>>> y_true_categorical = np.array(["spam", "ham", "ham", "spam"])
>>> y_prob = np.array([0.1, 0.9, 0.8, 0.3])
>>> brier_score_loss(y_true, y_prob)
0.0375
>>> brier_score_loss(y_true, 1-y_prob, pos_label=0)
0.0375
>>> brier_score_loss(y_true_categorical, y_prob, pos_label="ham")
0.0375
>>> brier_score_loss(y_true, np.array(y_prob) > 0.5)
0.0
>>> brier_score_loss(y_true, y_prob, scale_by_half=False)
0.075
>>> brier_score_loss(
...    ["eggs", "ham", "spam"],
...    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.2, 0.2, 0.6]],
...    labels=["eggs", "ham", "spam"]
... )
0.146

Gallery examples#

Probability calibration of classifiers

Probability Calibration curves

Probability Calibration for 3-class classification

Examples of Using FrozenEstimator