sklearn.metrics.homogeneity_completeness_v_measure

sklearn.metrics.homogeneity_completeness_v_measure(labels_true, labels_pred, *, beta=1.0)[source]

Compute the homogeneity and completeness and V-Measure scores at once.

Those metrics are based on normalized conditional entropy measures of the clustering labeling to evaluate given the knowledge of a Ground Truth class labels of the same samples.

A clustering result satisfies homogeneity if all of its clusters contain only data points which are members of a single class.

A clustering result satisfies completeness if all the data points that are members of a given class are elements of the same cluster.

Both scores have positive values between 0.0 and 1.0, larger values being desirable.

Those 3 metrics are independent of the absolute values of the labels: a permutation of the class or cluster label values won’t change the score values in any way.

V-Measure is furthermore symmetric: swapping labels_true and label_pred will give the same score. This does not hold for homogeneity and completeness. V-Measure is identical to normalized_mutual_info_score with the arithmetic averaging method.

Read more in the User Guide.

Parameters
labels_trueint array, shape = [n_samples]

ground truth class labels to be used as a reference

labels_predarray-like of shape (n_samples,)

cluster labels to evaluate

betafloat, default=1.0

Ratio of weight attributed to homogeneity vs completeness. If beta is greater than 1, completeness is weighted more strongly in the calculation. If beta is less than 1, homogeneity is weighted more strongly.

Returns
homogeneityfloat

score between 0.0 and 1.0. 1.0 stands for perfectly homogeneous labeling

completenessfloat

score between 0.0 and 1.0. 1.0 stands for perfectly complete labeling

v_measurefloat

harmonic mean of the first two