SpectralBiclustering

class sklearn.cluster.SpectralBiclustering(n_clusters=3, *, method='bistochastic', n_components=6, n_best=3, svd_method='randomized', n_svd_vecs=None, mini_batch=False, init='k-means++', n_init=10, random_state=None)

Spectral biclustering (Kluger, 2003) [1].

Partitions rows and columns under the assumption that the data has an underlying checkerboard structure. For instance, if there are two row partitions and three column partitions, each row will belong to three biclusters, and each column will belong to two biclusters. The outer product of the corresponding row and column label vectors gives this checkerboard structure.
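The membership counts described above can be checked on synthetic data. The following is a minimal sketch, assuming data generated with sklearn.datasets.make_checkerboard; the matrix shape, noise level, and seeds are arbitrary illustrative choices and do not come from this page.

```python
import numpy as np
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard

# Synthetic 60 x 40 matrix with 2 row partitions and 3 column partitions
# (shape, noise, and seeds are illustrative assumptions).
n_row_clusters, n_col_clusters = 2, 3
X, _, _ = make_checkerboard(
    shape=(60, 40),
    n_clusters=(n_row_clusters, n_col_clusters),
    noise=1.0,
    random_state=0,
)

model = SpectralBiclustering(
    n_clusters=(n_row_clusters, n_col_clusters), random_state=0
).fit(X)

# rows_[i, r] is True when row r belongs to bicluster i (likewise columns_).
rows_per_bicluster = model.rows_
cols_per_bicluster = model.columns_

# Count how many biclusters each row and each column belongs to.
memberships_per_row = rows_per_bicluster.sum(axis=0)
memberships_per_col = cols_per_bicluster.sum(axis=0)

print(rows_per_bicluster.shape)
print(np.unique(memberships_per_row))
print(np.unique(memberships_per_col))
```

With two row partitions and three column partitions there are six biclusters in total, so every row is counted three times and every column twice, regardless of how well the partitions were recovered.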

See also

SpectralCoclustering: Clusters rows and columns of an array X to solve the relaxed normalized cut of the bipartite graph created from X.

References

[1] Kluger, Yuval, et al., 2003. Spectral biclustering of microarray data: coclustering genes and conditions.

Examples

>>> from sklearn.cluster import SpectralBiclustering
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])
>>> clustering = SpectralBiclustering(n_clusters=2, random_state=0).fit(X)
>>> clustering.row_labels_
array([1, 1, 1, 0, 0, 0], dtype=int32)
>>> clustering.column_labels_
array([1, 0], dtype=int32)
>>> clustering
SpectralBiclustering(n_clusters=2, random_state=0)

For a more detailed example, see "A demo of the Spectral Biclustering algorithm".

fit(X, y=None)

Create a biclustering for X.

Parameters:

X : array-like of shape (n_samples, n_features)
    Training data.
y : Ignored
    Not used, present for API consistency by convention.

Returns:

self : object
    SpectralBiclustering instance.

get_indices(i)

Row and column indices of the i’th bicluster.

Only works if rows_ and columns_ attributes exist.

Parameters:

i : int
    The index of the cluster.

Returns:

row_ind : ndarray, dtype=np.intp
    Indices of rows in the dataset that belong to the bicluster.
col_ind : ndarray, dtype=np.intp
    Indices of columns in the dataset that belong to the bicluster.
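As a small illustration of how get_indices relates to the fitted rows_ and columns_ attributes, the sketch below fits the estimator on synthetic data and compares the returned indices with the nonzero positions of the boolean masks. The data from make_checkerboard and all sizes and seeds are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard

# Illustrative data: a 30 x 20 matrix with a 2 x 2 checkerboard structure.
X, _, _ = make_checkerboard(
    shape=(30, 20), n_clusters=(2, 2), noise=1.0, random_state=0
)
model = SpectralBiclustering(n_clusters=2, random_state=0).fit(X)

# Integer indices of the rows and columns that make up bicluster 0.
row_ind, col_ind = model.get_indices(0)

# The same indices recovered by hand from the boolean membership masks.
expected_rows = np.flatnonzero(model.rows_[0])
expected_cols = np.flatnonzero(model.columns_[0])

print(row_ind)
print(col_ind)
```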

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest
    A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict
    Parameter names mapped to their values.

get_shape(i)

Shape of the i’th bicluster.

Parameters:

i : int
    The index of the cluster.

Returns:

n_rows : int
    Number of rows in the bicluster.
n_cols : int
    Number of columns in the bicluster.
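A short sketch of get_shape follows, under the assumption of synthetic checkerboard data (sizes and seeds are arbitrary). Because the biclusters of a checkerboard partition tile the whole matrix, the cell counts of all biclusters should add up to the number of entries in X.

```python
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard

# Illustrative data: 2 row partitions and 3 column partitions.
X, _, _ = make_checkerboard(
    shape=(60, 40), n_clusters=(2, 3), noise=1.0, random_state=0
)
model = SpectralBiclustering(n_clusters=(2, 3), random_state=0).fit(X)

# One (n_rows, n_cols) pair per bicluster.
n_biclusters = model.rows_.shape[0]
shapes = [model.get_shape(i) for i in range(n_biclusters)]

# Total number of matrix cells covered by all biclusters together.
total_cells = sum(n_rows * n_cols for n_rows, n_cols in shapes)

for i, shape in enumerate(shapes):
    print(i, shape)
```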

get_submatrix(i, data)

Return the submatrix corresponding to bicluster i.

Parameters:

i : int
    The index of the cluster.
data : array-like of shape (n_samples, n_features)
    The data.

Returns:

submatrix : ndarray of shape (n_rows, n_cols)
    The submatrix corresponding to bicluster i.

Notes

Works with sparse matrices. Only works if rows_ and columns_ attributes exist.
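The sketch below extracts one bicluster from a dense array and from a SciPy CSR matrix holding the same values. The synthetic data, its size, and the seeds are assumptions for illustration only.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard

# Illustrative data with a 2 x 3 checkerboard structure.
X, _, _ = make_checkerboard(
    shape=(60, 40), n_clusters=(2, 3), noise=1.0, random_state=0
)
model = SpectralBiclustering(n_clusters=(2, 3), random_state=0).fit(X)

# Submatrix of bicluster 0 taken from the dense data.
sub_dense = model.get_submatrix(0, X)

# The same submatrix taken from a sparse copy of the data.
sub_sparse = model.get_submatrix(0, csr_matrix(X))

# Manual equivalent built from the bicluster's row and column indices.
row_ind, col_ind = model.get_indices(0)
sub_manual = X[np.ix_(row_ind, col_ind)]

print(sub_dense.shape)
```

The data passed to get_submatrix does not have to be the array used in fit, but it needs the same row and column layout for the indices to be meaningful.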

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict
    Estimator parameters.

Returns:

self : estimator instance
    Estimator instance.
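To make the <component>__<parameter> form concrete, here is a minimal sketch that sets parameters directly on the estimator and then through a Pipeline. The step names "scale" and "bicluster" and the use of MinMaxScaler are hypothetical choices that only serve to show the naming syntax; nothing is fitted.

```python
from sklearn.cluster import SpectralBiclustering
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Direct use: set_params returns the estimator itself.
model = SpectralBiclustering(n_clusters=3)
returned = model.set_params(n_clusters=(2, 3), method="log")
params = model.get_params()

# Nested use: the step name and parameter name are joined by "__".
# Step names here are hypothetical labels chosen for this example.
pipe = Pipeline(
    [("scale", MinMaxScaler()), ("bicluster", SpectralBiclustering())]
)
pipe.set_params(bicluster__n_best=2)
nested_params = pipe.get_params(deep=True)

print(params["n_clusters"], params["method"])
print(nested_params["bicluster__n_best"])
```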

Gallery examples

A demo of the Spectral Biclustering algorithm