sklearn.decomposition.dict_learning_online(X, n_components=2, alpha=1, n_iter=100, return_code=True, dict_init=None, callback=None, batch_size=3, verbose=False, shuffle=True, n_jobs=1, method='lars', iter_offset=0, random_state=None, return_inner_stats=False, inner_stats=None)

Solves a dictionary learning matrix factorization problem online.

Finds the best dictionary and the corresponding sparse code for approximating the data matrix X by solving:

(U^*, V^*) = argmin 0.5 || X - U V ||_2^2 + alpha * || U ||_1
             with || V_k ||_2 = 1 for all  0 <= k < n_components

where V is the dictionary and U is the sparse code. This is accomplished by repeatedly iterating over mini-batches by slicing the input data.
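To make the objective concrete, the following NumPy sketch evaluates it for given factors. It assumes the squared norm of the residual matrix is the sum of its squared entries and that the l1 term sums the absolute values of all entries of U. The helper names dictionary_learning_objective and has_unit_norm_atoms are hypothetical and not part of scikit-learn.

```python
import numpy as np


def dictionary_learning_objective(X, U, V, alpha):
    """Evaluate 0.5 * ||X - U V||^2 + alpha * ||U||_1 for given factors.

    Assumption: the squared norm is the sum of squared residual entries.
    """
    residual = X - U @ V
    reconstruction_error = 0.5 * np.sum(residual ** 2)
    sparsity_penalty = alpha * np.sum(np.abs(U))
    return reconstruction_error + sparsity_penalty


def has_unit_norm_atoms(V, tol=1e-8):
    """Check the constraint ||V_k||_2 = 1 for every row (atom) of V."""
    return bool(np.allclose(np.linalg.norm(V, axis=1), 1.0, atol=tol))


# Build a toy problem where X is reproduced exactly by a sparse code.
rng = np.random.RandomState(0)
V = rng.randn(3, 5)                            # 3 atoms, 5 features
V /= np.linalg.norm(V, axis=1, keepdims=True)  # normalize each atom
U = np.zeros((4, 3))                           # 4 samples, 3 components
U[0, 1] = 2.0
U[2, 0] = -1.5
X = U @ V

# The residual is zero here, so only the sparsity penalty remains.
objective = dictionary_learning_objective(X, U, V, alpha=1.0)
```

In this toy case the reconstruction is exact, so the objective reduces to alpha times the sum of absolute code entries.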


X : array of shape (n_samples, n_features)

Data matrix.

n_components : int,

Number of dictionary atoms to extract.

alpha : float,

Sparsity controlling parameter.

n_iter : int,

Number of iterations to perform.

return_code : boolean,

Whether to also return the code U or just the dictionary V.

dict_init : array of shape (n_components, n_features),

Initial value for the dictionary for warm restart scenarios.

callback : callable,

Callable that gets invoked every five iterations.

batch_size : int,

The number of samples to take in each batch.

verbose : boolean or int,

Degree of output the procedure will print.

shuffle : boolean,

Whether to shuffle the data before splitting it in batches.

n_jobs : int,

Number of parallel jobs to run, or -1 to autodetect.

method : {‘lars’, ‘cd’}

lars: uses the least angle regression method to solve the lasso problem (linear_model.lars_path).

cd: uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso).

Lars will be faster if the estimated components are sparse.

iter_offset : int, default 0

Number of previous iterations completed on the dictionary used for initialization.

random_state : int or RandomState

Pseudo-random number generator state used for random sampling.

return_inner_stats : boolean, optional

Return the inner statistics A (dictionary covariance) and B (data approximation). Useful to restart the algorithm in an online setting. If return_inner_stats is True, return_code is ignored.

inner_stats : tuple of (A, B) ndarrays

Inner sufficient statistics that are kept by the algorithm. Passing them at initialization is useful in online settings, to avoid losing the history of the evolution. A (n_components, n_components) is the dictionary covariance matrix. B (n_features, n_components) is the data approximation matrix.
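The role of batch_size, shuffle and inner_stats can be illustrated with a simplified NumPy sketch. It is not the library's implementation: it assumes the sparse codes are already known, accumulates A and B as plain sums over mini-batches, and omits any down-weighting of older batches that the actual algorithm may apply. The function name accumulate_inner_stats is hypothetical.

```python
import numpy as np


def accumulate_inner_stats(X, codes, batch_size=3, shuffle=True,
                           random_state=None, inner_stats=None):
    """Accumulate A (n_components, n_components) and B (n_features,
    n_components) over mini-batches sliced from X.

    Assumption: plain sums, with no forgetting of older batches.
    """
    n_samples, n_features = X.shape
    n_components = codes.shape[1]

    if inner_stats is None:
        # Fresh start: no history yet.
        A = np.zeros((n_components, n_components))
        B = np.zeros((n_features, n_components))
    else:
        # Warm restart: continue from previously returned statistics.
        A, B = (np.array(stat, dtype=float) for stat in inner_stats)

    rng = np.random.RandomState(random_state)
    order = rng.permutation(n_samples) if shuffle else np.arange(n_samples)

    for start in range(0, n_samples, batch_size):
        batch = order[start:start + batch_size]
        X_batch, code_batch = X[batch], codes[batch]
        A += code_batch.T @ code_batch  # dictionary covariance
        B += X_batch.T @ code_batch     # data approximation
    return A, B


rng = np.random.RandomState(0)
X = rng.randn(10, 4)       # 10 samples, 4 features
codes = rng.randn(10, 2)   # stand-in codes for 2 components

# Process the data in two sessions, passing the statistics along.
stats_first = accumulate_inner_stats(X[:6], codes[:6], random_state=0)
A, B = accumulate_inner_stats(X[6:], codes[6:], random_state=0,
                              inner_stats=stats_first)
```

Because the statistics in this sketch are plain sums, resuming from stats_first yields the same A and B as a single pass over all ten samples, which is the sense in which passing inner_stats preserves the history.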


code : array of shape (n_samples, n_components),

The sparse code (only returned if return_code=True).

dictionary : array of shape (n_components, n_features),

The solutions to the dictionary learning problem.
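A minimal usage sketch on synthetic data follows. It passes only keyword arguments that appear in the signature above and leaves n_iter at its default, since the name and default of that argument may differ between scikit-learn releases. The data and the chosen values are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import dict_learning_online

# Synthetic data: 60 samples with 8 features each.
rng = np.random.RandomState(0)
X = rng.randn(60, 8)

# With the default return_code=True, both the code and the
# dictionary are returned, in that order.
code, dictionary = dict_learning_online(
    X, n_components=4, alpha=1, batch_size=10, random_state=0
)

print(code.shape)        # (n_samples, n_components)
print(dictionary.shape)  # (n_components, n_features)
```

The product of code and dictionary then approximates X, with alpha trading reconstruction accuracy against sparsity of the code.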