rand_score#

sklearn.metrics.rand_score(labels_true, labels_pred)[source]#

Rand index.

The Rand Index computes a similarity measure between two clusterings by considering all pairs of samples and counting pairs that are assigned in the same or different clusters in the predicted and true clusterings [1] [2].

The raw RI score [3] is:

RI = (number of agreeing pairs) / (number of pairs)

See also

adjusted_rand_score: Adjusted Rand Score.
adjusted_mutual_info_score: Adjusted Mutual Information.

References

[1]

Hubert, L., Arabie, P. “Comparing partitions.” Journal of Classification 2, 193–218 (1985)..

[2]

Wikipedia: Simple Matching Coefficient

[3]

Wikipedia: Rand Index

Examples

Perfectly matching labelings have a score of 1 even

>>> from sklearn.metrics.cluster import rand_score
>>> rand_score([0, 0, 1, 1], [1, 1, 0, 0])
1.0

Labelings that assign all classes members to the same clusters are complete but may not always be pure, hence penalized:

>>> rand_score([0, 0, 1, 2], [0, 0, 1, 1])
0.83

Gallery examples#

Adjustment for chance in clustering performance evaluation