`sklearn.cluster`.SpectralClustering

class sklearn.cluster.SpectralClustering(n_clusters=8, *, eigen_solver=None, n_components=None, random_state=None, n_init=10, gamma=1.0, affinity='rbf', n_neighbors=10, eigen_tol=0.0, assign_labels='kmeans', degree=3, coef0=1, kernel_params=None, n_jobs=None, verbose=False)

Apply clustering to a projection of the normalized Laplacian.

In practice Spectral Clustering is very useful when the structure of the individual clusters is highly non-convex, or more generally when a measure of the center and spread of the cluster is not a suitable description of the complete cluster, such as when clusters are nested circles on the 2D plane.
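The nested-circles case can be sketched as follows. This is a minimal illustration, not part of the reference itself: the dataset parameters (`n_samples=500`, `factor=0.5`, `noise=0.05`) and the choice of `affinity="nearest_neighbors"` are assumptions made for the example, and `KMeans` is included only as a centroid-based point of comparison.

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_circles
from sklearn.metrics import adjusted_rand_score

# Two concentric rings: a non-convex structure that a center and a spread
# cannot describe.
X, y_true = make_circles(n_samples=500, factor=0.5, noise=0.05, random_state=0)

# A k-nearest neighbors affinity follows the local connectivity of each ring.
# scikit-learn may warn that the graph is not fully connected.
spectral = SpectralClustering(
    n_clusters=2, affinity="nearest_neighbors", random_state=0
)
spectral_labels = spectral.fit_predict(X)

# Centroid-based baseline for comparison.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Adjusted Rand index against the known ring membership (1.0 is a perfect match).
ari_spectral = adjusted_rand_score(y_true, spectral_labels)
ari_kmeans = adjusted_rand_score(y_true, kmeans_labels)
print(f"spectral ARI: {ari_spectral:.2f}, k-means ARI: {ari_kmeans:.2f}")
```

On data like this, the spectral result is expected to track the rings far more closely than the k-means split, which tends to cut the plane in half.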

If the affinity matrix is the adjacency matrix of a graph, this method can be used to find normalized graph cuts [1], [2].

When calling fit, an affinity matrix is constructed using either a kernel function such as the Gaussian (aka RBF) kernel with Euclidean distance d(X, X):

np.exp(-gamma * d(X,X) ** 2)

or a k-nearest neighbors connectivity matrix.
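As a small check of the kernel formula, the sketch below rebuilds the RBF affinity by hand and compares it with the `affinity_matrix_` attribute of a fitted estimator. The data and the value `gamma = 0.5` are illustrative assumptions, and `scipy.spatial.distance.cdist` is used here only as one convenient way to get the pairwise Euclidean distances.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import SpectralClustering

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]], dtype=float)
gamma = 0.5  # illustrative kernel coefficient

model = SpectralClustering(
    n_clusters=2, affinity="rbf", gamma=gamma, random_state=0
).fit(X)

# Pairwise Euclidean distances d(X, X), then the Gaussian kernel from the text.
d = cdist(X, X, metric="euclidean")
manual_affinity = np.exp(-gamma * d ** 2)

# The hand-built matrix should agree with the one the estimator stored.
matches = np.allclose(model.affinity_matrix_, manual_affinity)
print(matches)
```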

Alternatively, a user-provided affinity matrix can be specified by setting affinity='precomputed'.
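A minimal sketch of the precomputed route, using a graph adjacency matrix as the affinity. The graph itself (two 4-node cliques joined by one weak edge of weight 0.1) is a hypothetical example chosen so that the natural cut is obvious.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical graph: two 4-node cliques joined by a single weak edge.
n = 4
clique = np.ones((n, n)) - np.eye(n)  # all-to-all edges, no self loops
adjacency = np.zeros((2 * n, 2 * n))
adjacency[:n, :n] = clique
adjacency[n:, n:] = clique
adjacency[n - 1, n] = adjacency[n, n - 1] = 0.1  # weak bridge between cliques

# With affinity="precomputed", X is interpreted as the affinity matrix itself.
model = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0)
labels = model.fit_predict(adjacency)
print(labels)
```

The matrix passed in must be square and symmetric, with larger values meaning stronger similarity. The label values themselves are arbitrary, so only the grouping is meaningful.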

See also

sklearn.cluster.KMeans: K-Means clustering.
sklearn.cluster.DBSCAN: Density-Based Spatial Clustering of Applications with Noise.

Notes

A distance matrix for which 0 indicates identical elements and high values indicate very dissimilar elements can be transformed into an affinity / similarity matrix that is well-suited for the algorithm by applying the Gaussian (aka RBF, heat) kernel:

np.exp(- dist_matrix ** 2 / (2. * delta ** 2))

where delta is a free parameter representing the width of the Gaussian kernel.
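The transformation can be sketched like this, assuming Euclidean distances from `sklearn.metrics.pairwise_distances` and an illustrative width `delta = 1.5`. Any distance matrix with zeros on the diagonal could be substituted.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import pairwise_distances

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]], dtype=float)

# Distance matrix: 0 for identical elements, large for dissimilar ones.
dist_matrix = pairwise_distances(X, metric="euclidean")

delta = 1.5  # illustrative kernel width; tune it for real data
affinity = np.exp(-dist_matrix ** 2 / (2.0 * delta ** 2))

# Pass the similarity matrix directly instead of raw features.
labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0
).fit_predict(affinity)
print(labels)
```

For Euclidean distances this is the same kernel as `affinity='rbf'` with `gamma = 1 / (2 * delta ** 2)`.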

An alternative is to take a symmetric version of the k-nearest neighbors connectivity matrix of the points.
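One way to build such a symmetric matrix by hand is sketched below, using `sklearn.neighbors.kneighbors_graph` and averaging the graph with its transpose. The blob dataset and `n_neighbors=5` are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph

X, _ = make_blobs(
    n_samples=60, centers=[[0, 0], [10, 10]], cluster_std=1.0, random_state=0
)

# Directed k-NN graph: row i marks the neighbors of sample i, so the matrix
# is not symmetric in general.
connectivity = kneighbors_graph(X, n_neighbors=5, include_self=True)

# Averaging with the transpose yields a symmetric affinity matrix.
affinity = 0.5 * (connectivity + connectivity.T)
dense = affinity.toarray()
is_symmetric = np.allclose(dense, dense.T)

# scikit-learn may warn that the graph is not fully connected.
labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0
).fit_predict(affinity)
print(is_symmetric, labels.shape)
```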

If the pyamg package is installed, it is used: this greatly speeds up computation.

References

[1] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. 2000.
[2] Ulrike von Luxburg. A Tutorial on Spectral Clustering. 2007.
[3] Stella X. Yu and Jianbo Shi. Multiclass spectral clustering. 2003.
[4] A. V. Knyazev. Toward the Optimal Preconditioned Eigensolver: Locally Optimal Block Preconditioned Conjugate Gradient Method. SIAM Journal on Scientific Computing 23, no. 2, pp. 517-541, 2001.

Examples

>>> from sklearn.cluster import SpectralClustering
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])
>>> clustering = SpectralClustering(n_clusters=2,
...         assign_labels='discretize',
...         random_state=0).fit(X)
>>> clustering.labels_
array([1, 1, 1, 0, 0, 0])
>>> clustering
SpectralClustering(assign_labels='discretize', n_clusters=2,
    random_state=0)

Methods

`fit`(X[, y])	Perform spectral clustering from features, or affinity matrix.
`fit_predict`(X[, y])	Perform spectral clustering on `X` and return cluster labels.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.

fit(X, y=None)

Perform spectral clustering from features, or affinity matrix.

Parameters

X ({array-like, sparse matrix} of shape (n_samples, n_features) or (n_samples, n_samples)): Training instances to cluster, similarities / affinities between instances if affinity='precomputed', or distances between instances if affinity='precomputed_nearest_neighbors'. If a sparse matrix is provided in a format other than csr_matrix, csc_matrix, or coo_matrix, it will be converted into a sparse csr_matrix.
y (Ignored): Not used, present here for API consistency by convention.

Returns

self (object): A fitted instance of the estimator.

fit_predict(X, y=None)

Perform spectral clustering on X and return cluster labels.

Parameters

X ({array-like, sparse matrix} of shape (n_samples, n_features) or (n_samples, n_samples)): Training instances to cluster, similarities / affinities between instances if affinity='precomputed', or distances between instances if affinity='precomputed_nearest_neighbors'. If a sparse matrix is provided in a format other than csr_matrix, csc_matrix, or coo_matrix, it will be converted into a sparse csr_matrix.
y (Ignored): Not used, present here for API consistency by convention.

Returns

labels (ndarray of shape (n_samples,)): Cluster labels.
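A short sketch of how `fit_predict` relates to `fit`: the returned array is expected to match the `labels_` attribute of an estimator fitted on the same data with the same `random_state`. The data below is illustrative.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]], dtype=float)

# One call: fit the model and get the labels back directly.
labels = SpectralClustering(n_clusters=2, random_state=0).fit_predict(X)

# Two steps: fit, then read the labels_ attribute.
fitted = SpectralClustering(n_clusters=2, random_state=0).fit(X)

same = np.array_equal(labels, fitted.labels_)
print(labels.shape, same)
```

Note that the estimator has no separate `predict` method in the list above, so labels are only available for the data passed to `fit` or `fit_predict`.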

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params (dict): Parameter names mapped to their values.
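For illustration, the returned dictionary can be inspected or used to build an identically configured estimator. The parameter values below are arbitrary.

```python
from sklearn.cluster import SpectralClustering

model = SpectralClustering(
    n_clusters=2, assign_labels="discretize", random_state=0
)

# Dictionary of constructor parameters, including untouched defaults.
params = model.get_params()
print(params["n_clusters"], params["affinity"])

# The dictionary can be passed back to the constructor to get an
# unfitted estimator with the same configuration.
copy = SpectralClustering(**params)
```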

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict): Estimator parameters.

Returns

self (estimator instance): Estimator instance.
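Both forms of `set_params` can be sketched as below. The pipeline and its step names (`"scale"`, `"cluster"`) are hypothetical and exist only to show the `<component>__<parameter>` syntax.

```python
from sklearn.cluster import SpectralClustering
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are updated in place and self is returned.
model = SpectralClustering(n_clusters=2)
returned = model.set_params(n_clusters=3, affinity="nearest_neighbors")

# Nested object: address a step's parameter as <component>__<parameter>.
pipe = Pipeline([
    ("scale", StandardScaler()),        # hypothetical step name
    ("cluster", SpectralClustering()),  # hypothetical step name
])
pipe.set_params(cluster__n_clusters=4, cluster__gamma=0.5)

print(model.n_clusters, pipe.named_steps["cluster"].n_clusters)
```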

Examples using `sklearn.cluster.SpectralClustering`
