sklearn.datasets.make_checkerboard(shape, n_clusters, *, noise=0.0, minval=10, maxval=100, shuffle=True, random_state=None)

Generate an array with block checkerboard structure for biclustering.

Read more in the User Guide.

Parameters:

shape : tuple of shape (n_rows, n_cols)

The shape of the result.

n_clusters : int or array-like of shape (n_row_clusters, n_column_clusters)

The number of row and column clusters. An int uses the same number for both.

noise : float, default=0.0

The standard deviation of the Gaussian noise added to the array.

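minval : float, default=10

Minimum value of a bicluster. Before noise is added, each block holds a single value drawn uniformly between minval and maxval.

maxval : float, default=100

Maximum value of a bicluster.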

shuffle : bool, default=True

Shuffle the rows and columns of the generated array.

random_state : int, RandomState instance or None, default=None

Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.
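The parameters above can be tried together in a small sketch. The shape, the cluster counts and the seed below are arbitrary choices for illustration; passing a pair as n_clusters and setting shuffle=False are used only to make the block layout easy to inspect.

```python
import numpy as np
from sklearn.datasets import make_checkerboard

# A pair gives different numbers of row and column clusters.
X, rows, cols = make_checkerboard(
    shape=(60, 40),
    n_clusters=(4, 3),   # 4 row clusters, 3 column clusters
    minval=10,
    maxval=100,
    shuffle=False,       # keep rows and columns in cluster order
    random_state=0,
)

# One indicator row per bicluster: 4 * 3 = 12 of them.
print(rows.shape, cols.shape)  # (12, 60) (12, 40)

# With noise=0.0, every entry is a block value drawn between minval and maxval.
print(X.min() >= 10, X.max() <= 100)

# Without shuffling, each column is made of contiguous constant runs,
# so the value can change at most 3 times going down a column.
changes = np.count_nonzero(np.diff(X[:, 0]))
print(changes)
```

With shuffle=True (the default) the same blocks are present, but their rows and columns are permuted, so the runs are no longer contiguous.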

Returns:

X : ndarray of shape (n_rows, n_cols)

The generated array.

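rows : ndarray of shape (n_row_clusters * n_column_clusters, X.shape[0])

The indicators for cluster membership of each row. There is one boolean row per bicluster, that is, per pairing of a row cluster with a column cluster.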

cols : ndarray of shape (n_row_clusters * n_column_clusters, X.shape[1])

The indicators for cluster membership of each column. There is one boolean row per bicluster, in the same order as in rows.
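As a rough illustration of how the returned indicators can be used, the sketch below pulls one bicluster out of the array with the two boolean masks. The shape, cluster counts, seed and the index k are arbitrary; np.ix_ is just one convenient way to combine a row mask with a column mask.

```python
import numpy as np
from sklearn.datasets import make_checkerboard

n_row_clusters, n_col_clusters = 3, 2
X, rows, cols = make_checkerboard(
    shape=(30, 20),
    n_clusters=(n_row_clusters, n_col_clusters),
    random_state=0,
)

# Each data row falls in one row cluster, which is paired with every
# column cluster, so it is flagged in n_col_clusters biclusters.
row_memberships = rows.sum(axis=0)
col_memberships = cols.sum(axis=0)

# Pull out bicluster k as a submatrix using the two boolean masks.
k = 0
block = X[np.ix_(rows[k], cols[k])]

# With noise=0.0 (the default) a non-empty block holds a single value.
block_is_constant = block.size == 0 or np.ptp(block) == 0
print(block.shape, block_is_constant)
```

Because the indicators are returned after any shuffling, the masks select the right entries whether or not shuffle is enabled.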

See also


Generate an array with constant block diagonal structure for biclustering.



References

Kluger, Y., Basri, R., Chang, J. T., & Gerstein, M. (2003). Spectral biclustering of microarray data: coclustering genes and conditions. Genome Research, 13(4), 703-716.


Examples

>>> from sklearn.datasets import make_checkerboard
>>> data, rows, columns = make_checkerboard(shape=(300, 300), n_clusters=10,
...                                         random_state=42)
>>> data.shape
(300, 300)
>>> rows.shape
(100, 300)
>>> columns.shape
(100, 300)
>>> print(rows[0][:5], columns[0][:5])
[False False False  True False] [False False False False False]
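To complement the example above, the effect of the noise parameter can be checked with a small sketch. It assumes shuffle=False and a shared integer seed so that two calls produce the same underlying blocks; the array size and the noise level of 5.0 are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_checkerboard

common = dict(shape=(200, 200), n_clusters=4, shuffle=False, random_state=0)

clean, _, _ = make_checkerboard(noise=0.0, **common)
noisy, _, _ = make_checkerboard(noise=5.0, **common)

# With shuffle=False and the same seed, both calls draw the same block
# layout and block values, so the difference is the added noise alone.
residual = noisy - clean
print(float(residual.mean()), float(residual.std()))
```

The residual should have a mean close to 0 and a standard deviation close to the requested noise value. With shuffle=True the two arrays would generally be permuted differently, so this direct subtraction would no longer isolate the noise.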