sklearn.metrics.jaccard_score(y_true, y_pred, *, labels=None, pos_label=1, average='binary', sample_weight=None)[source]

Jaccard similarity coefficient score

The Jaccard index [1], or Jaccard similarity coefficient, defined as the size of the intersection divided by the size of the union of two label sets, is used to compare set of predicted labels for a sample to the corresponding set of labels in y_true.

Read more in the User Guide.

y_true1d array-like, or label indicator array / sparse matrix

Ground truth (correct) labels.

y_pred1d array-like, or label indicator array / sparse matrix

Predicted labels, as returned by a classifier.

labelslist, optional

The set of labels to include when average != 'binary', and their order if average is None. Labels present in the data can be excluded, for example to calculate a multiclass average ignoring a majority negative class, while labels not present in the data will result in 0 components in a macro average. For multilabel targets, labels are column indices. By default, all labels in y_true and y_pred are used in sorted order.

pos_labelstr or int, 1 by default

The class to report if average='binary' and the data is binary. If the data are multiclass or multilabel, this will be ignored; setting labels=[pos_label] and average != 'binary' will report scores for that label only.

averagestring, [None, ‘binary’ (default), ‘micro’, ‘macro’, ‘samples’, ‘weighted’]

If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:


Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.


Calculate metrics globally by counting the total true positives, false negatives and false positives.


Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.


Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance.


Calculate metrics for each instance, and find their average (only meaningful for multilabel classification).

sample_weightarray-like of shape (n_samples,), default=None

Sample weights.

scorefloat (if average is not None) or array of floats, shape = [n_unique_labels]


jaccard_score may be a poor metric if there are no positives for some samples or classes. Jaccard is undefined if there are no true or predicted labels, and our implementation will return a score of 0 with a warning.



Wikipedia entry for the Jaccard index


>>> import numpy as np
>>> from sklearn.metrics import jaccard_score
>>> y_true = np.array([[0, 1, 1],
...                    [1, 1, 0]])
>>> y_pred = np.array([[1, 1, 1],
...                    [1, 0, 0]])

In the binary case:

>>> jaccard_score(y_true[0], y_pred[0])

In the multilabel case:

>>> jaccard_score(y_true, y_pred, average='samples')
>>> jaccard_score(y_true, y_pred, average='macro')
>>> jaccard_score(y_true, y_pred, average=None)
array([0.5, 0.5, 1. ])

In the multiclass case:

>>> y_pred = [0, 2, 1, 2]
>>> y_true = [0, 1, 2, 2]
>>> jaccard_score(y_true, y_pred, average=None)
array([1. , 0. , 0.33...])

Examples using sklearn.metrics.jaccard_score