sklearn.metrics.homogeneity_completeness_v_measure

sklearn.metrics.homogeneity_completeness_v_measure(labels_true, labels_pred, *, beta=1.0)[source]

Compute the homogeneity and completeness and V-Measure scores at once.

Those metrics are based on normalized conditional entropy measures of the clustering labeling to evaluate given the knowledge of a Ground Truth class labels of the same samples.

A clustering result satisfies homogeneity if all of its clusters contain only data points which are members of a single class.

A clustering result satisfies completeness if all the data points that are members of a given class are elements of the same cluster.

Both scores have positive values between 0.0 and 1.0, larger values being desirable.

Those 3 metrics are independent of the absolute values of the labels: a permutation of the class or cluster label values won’t change the score values in any way.

V-Measure is furthermore symmetric: swapping labels_true and label_pred will give the same score. This does not hold for homogeneity and completeness. V-Measure is identical to normalized_mutual_info_score with the arithmetic averaging method.

Read more in the User Guide.

Parameters:
labels_truearray-like of shape (n_samples,)

Ground truth class labels to be used as a reference.

labels_predarray-like of shape (n_samples,)

Gluster labels to evaluate.

betafloat, default=1.0

Ratio of weight attributed to homogeneity vs completeness. If beta is greater than 1, completeness is weighted more strongly in the calculation. If beta is less than 1, homogeneity is weighted more strongly.

Returns:
homogeneityfloat

Score between 0.0 and 1.0. 1.0 stands for perfectly homogeneous labeling.

completenessfloat

Score between 0.0 and 1.0. 1.0 stands for perfectly complete labeling.

v_measurefloat

Harmonic mean of the first two.

See also

homogeneity_score

Homogeneity metric of cluster labeling.

completeness_score

Completeness metric of cluster labeling.

v_measure_score

V-Measure (NMI with arithmetic mean option).

Examples

>>> from sklearn.metrics import homogeneity_completeness_v_measure
>>> y_true, y_pred = [0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2]
>>> homogeneity_completeness_v_measure(y_true, y_pred)
(0.71..., 0.77..., 0.73...)