sklearn.datasets.make_sparse_coded_signal(n_samples, *, n_components, n_features, n_nonzero_coefs, random_state=None)

Generate a signal as a sparse combination of dictionary elements.

Returns matrices Y, D and X such that Y = XD where X is of shape (n_samples, n_components), D is of shape (n_components, n_features), and each row of X has exactly n_nonzero_coefs non-zero elements.

Read more in the User Guide.


n_samples : int

Number of samples to generate.

n_components : int

Number of components in the dictionary.

n_features : int

Number of features of the dataset to generate.

n_nonzero_coefs : int

Number of active (non-zero) coefficients in each sample.

random_state : int, RandomState instance or None, default=None

Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.

data : ndarray of shape (n_samples, n_features)

The encoded signal (Y).

dictionary : ndarray of shape (n_components, n_features)

The dictionary with normalized components (D).

code : ndarray of shape (n_samples, n_components)

The sparse code (X), such that each row of this matrix has exactly n_nonzero_coefs non-zero items.


>>> from sklearn.datasets import make_sparse_coded_signal
>>> data, dictionary, code = make_sparse_coded_signal(
...     n_samples=50,
...     n_components=100,
...     n_features=10,
...     n_nonzero_coefs=4,
...     random_state=0
... )
>>> data.shape
(50, 10)
>>> dictionary.shape
(100, 10)
>>> code.shape
(50, 100)