sklearn.metrics.calinski_harabasz_score

sklearn.metrics.calinski_harabasz_score(X, labels)

Compute the Calinski and Harabasz score.

It is also known as the Variance Ratio Criterion.

The score is defined as the ratio of the between-cluster dispersion to the within-cluster dispersion, each normalized by its degrees of freedom (n_clusters - 1 and n_samples - n_clusters, respectively).
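To make the definition concrete, here is a minimal, dependency-free sketch of the computation. It assumes the usual Variance Ratio Criterion form: between-cluster dispersion divided by (n_clusters - 1), over within-cluster dispersion divided by (n_samples - n_clusters). The name `variance_ratio` is a hypothetical helper, not part of scikit-learn. Returning 1.0 when the within-cluster dispersion is zero is a choice made for this sketch.

```python
def variance_ratio(points, labels):
    """Hypothetical helper: variance ratio criterion computed by hand."""
    n_samples = len(points)
    n_features = len(points[0])

    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    n_clusters = len(clusters)
    if not 2 <= n_clusters <= n_samples - 1:
        raise ValueError("need between 2 and n_samples - 1 distinct labels")

    def mean(rows):
        # Column-wise mean of a list of points.
        return [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]

    def sq_dist(a, b):
        # Squared Euclidean distance between two points.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # size-weighted squared distance of centroids to the overall mean
    within = 0.0   # squared distance of points to their own centroid
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)

    if within == 0.0:
        return 1.0  # assumption of this sketch: avoid dividing by zero
    return (between / (n_clusters - 1)) / (within / (n_samples - n_clusters))


# Two tight, well-separated pairs of points.
toy_points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
print(variance_ratio(toy_points, toy_labels))  # 50.0
```

In the toy data the between-cluster dispersion is 100 and the within-cluster dispersion is 4, so with n_samples = 4 and n_clusters = 2 the ratio is (100 / 1) / (4 / 2) = 50.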

Read more in the User Guide.

Parameters:
X : array-like of shape (n_samples, n_features)

A list of n_features-dimensional data points. Each row corresponds to a single data point.

labels : array-like of shape (n_samples,)

Predicted labels for each sample.

Returns:
score : float

The resulting Calinski-Harabasz score.

Examples

>>> from sklearn.datasets import make_blobs
>>> from sklearn.cluster import KMeans
>>> from sklearn.metrics import calinski_harabasz_score
>>> X, _ = make_blobs(random_state=0)
>>> kmeans = KMeans(n_clusters=3, random_state=0).fit(X)
>>> calinski_harabasz_score(X, kmeans.labels_)
114.8...
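A common use of the score is to compare clusterings of the same data, where a higher value indicates denser, better-separated clusters. The sketch below is one possible way to do this, not a prescribed workflow: it scores KMeans fits over a small range of cluster counts. The range 2 to 6 and the setting `n_init=10` are arbitrary choices for illustration, and the exact values depend on the scikit-learn version, so no output is shown.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score

X, _ = make_blobs(random_state=0)

# Score one KMeans fit per candidate number of clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = calinski_harabasz_score(X, labels)

# Pick the candidate with the highest score.
best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"n_clusters={k}: {score:.1f}")
print("highest score at n_clusters =", best_k)
```

The score needs at least 2 distinct labels and at most n_samples - 1, so the candidate range starts at 2.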