sklearn.metrics.fowlkes_mallows_score

sklearn.metrics.fowlkes_mallows_score(labels_true, labels_pred, *, sparse=False)

Measure the similarity of two clusterings of a set of points.
New in version 0.18.
The Fowlkes-Mallows index (FMI) is defined as the geometric mean of the pairwise precision and recall:
FMI = TP / sqrt((TP + FP) * (TP + FN))
Where TP is the number of True Positives (i.e. the number of pairs of points that belong to the same cluster in both labels_true and labels_pred), FP is the number of False Positives (i.e. the number of pairs of points that belong to the same cluster in labels_pred but not in labels_true) and FN is the number of False Negatives (i.e. the number of pairs of points that belong to the same cluster in labels_true but not in labels_pred).

The score ranges from 0 to 1. A high value indicates a good similarity between two clusters.
Read more in the User Guide.
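The pairwise counts in the formula above can be checked by hand with a brute-force loop over all point pairs (a minimal illustrative sketch only; the function itself derives these counts from the contingency matrix rather than looping over pairs):

>>> from itertools import combinations
>>> from math import sqrt
>>> from sklearn.metrics import fowlkes_mallows_score
>>> labels_true = [0, 0, 1, 1]
>>> labels_pred = [0, 0, 1, 2]
>>> tp = fp = fn = 0
>>> for i, j in combinations(range(len(labels_true)), 2):
...     same_true = labels_true[i] == labels_true[j]
...     same_pred = labels_pred[i] == labels_pred[j]
...     tp += same_true and same_pred      # pair together in both labelings
...     fp += same_pred and not same_true  # together only in labels_pred
...     fn += same_true and not same_pred  # together only in labels_true
>>> round(tp / sqrt((tp + fp) * (tp + fn)), 4)
0.7071
>>> round(fowlkes_mallows_score(labels_true, labels_pred), 4)
0.7071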
- Parameters
- labels_true : int array, shape = (n_samples,)
  A clustering of the data into disjoint subsets.
- labels_pred : array, shape = (n_samples,)
  A clustering of the data into disjoint subsets.
- sparse : bool, default=False
  Compute the contingency matrix internally with a sparse matrix.
- Returns
- score : float
  The resulting Fowlkes-Mallows score.
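The sparse flag only affects how the contingency matrix is represented internally; assuming the version documented here, passing sparse=True returns the same score:

>>> from sklearn.metrics import fowlkes_mallows_score
>>> fowlkes_mallows_score([0, 0, 1, 1], [0, 0, 1, 1], sparse=True)
1.0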
Examples
Perfect labelings are both homogeneous and complete, hence have score 1.0:
>>> from sklearn.metrics.cluster import fowlkes_mallows_score
>>> fowlkes_mallows_score([0, 0, 1, 1], [0, 0, 1, 1])
1.0
>>> fowlkes_mallows_score([0, 0, 1, 1], [1, 1, 0, 0])
1.0
If class members are completely split across different clusters, the assignment is totally random and the FMI is zero:
>>> fowlkes_mallows_score([0, 0, 0, 0], [0, 1, 2, 3])
0.0
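Labelings that agree only partially give an intermediate value between 0 and 1 (a small sketch, output rounded for display):

>>> round(fowlkes_mallows_score([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]), 4)
0.4714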