The Rand Index computes a similarity measure between two clusterings by considering all pairs of samples and counting the pairs on which the two clusterings agree: pairs placed in the same cluster in both the predicted and true clusterings, or in different clusters in both.
The raw RI score is:
RI = (number of agreeing pairs) / (number of pairs)
Read more in the User Guide.
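The pair-counting definition above can be sketched in plain Python. This is an illustrative brute-force version, not the library's implementation: the helper name `rand_index_pairs` is hypothetical, and returning 1.0 when there are fewer than two samples is an assumption of this sketch.

```python
from itertools import combinations


def rand_index_pairs(labels_true, labels_pred):
    """Brute-force Rand Index over all sample pairs (hypothetical helper, O(n^2))."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labels_true and labels_pred must have the same length")

    agreeing = 0
    total = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # A pair agrees when both clusterings group it together
        # or both clusterings keep it apart.
        if same_true == same_pred:
            agreeing += 1
        total += 1

    # Assumption of this sketch: with no pairs to compare, report a perfect match.
    return agreeing / total if total else 1.0


print(rand_index_pairs([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0, all 6 pairs agree
print(rand_index_pairs([0, 0, 1, 2], [0, 0, 1, 1]))  # 5 of 6 pairs agree, about 0.83
```

Enumerating every pair keeps the correspondence with the formula obvious, but it scales quadratically with the number of samples, so it is only suitable for small inputs.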
- labels_true : array-like of shape (n_samples,), dtype=integral
Ground truth class labels to be used as a reference.
- labels_pred : array-like of shape (n_samples,), dtype=integral
Cluster labels to evaluate.
Similarity score between 0.0 and 1.0, inclusive; 1.0 indicates a perfect match.
Perfectly matching labelings have a score of 1, even when the label values are permuted:
>>> from sklearn.metrics.cluster import rand_score
>>> rand_score([0, 0, 1, 1], [1, 1, 0, 0])
1.0
Labelings that assign all class members to the same clusters are complete but may not always be pure, and are therefore penalized:
>>> rand_score([0, 0, 1, 2], [0, 0, 1, 1])
0.83...