sklearn.covariance.GraphicalLasso

class sklearn.covariance.GraphicalLasso(alpha=0.01, *, mode='cd', tol=0.0001, enet_tol=0.0001, max_iter=100, verbose=False, assume_centered=False)[source]

Sparse inverse covariance estimation with an l1-penalized estimator.

Read more in the User Guide.

Changed in version v0.20: GraphLasso has been renamed to GraphicalLasso

Parameters:
alphafloat, default=0.01

The regularization parameter: the higher alpha, the more regularization, the sparser the inverse covariance. Range is (0, inf].

mode{‘cd’, ‘lars’}, default=’cd’

The Lasso solver to use: coordinate descent or LARS. Use LARS for very sparse underlying graphs, where p > n. Elsewhere prefer cd which is more numerically stable.

tolfloat, default=1e-4

The tolerance to declare convergence: if the dual gap goes below this value, iterations are stopped. Range is (0, inf].

enet_tolfloat, default=1e-4

The tolerance for the elastic net solver used to calculate the descent direction. This parameter controls the accuracy of the search direction for a given column update, not of the overall parameter estimate. Only used for mode=’cd’. Range is (0, inf].

max_iterint, default=100

The maximum number of iterations.

verbosebool, default=False

If verbose is True, the objective function and dual gap are plotted at each iteration.

assume_centeredbool, default=False

If True, data are not centered before computation. Useful when working with data whose mean is almost, but not exactly zero. If False, data are centered before computation.

Attributes:
location_ndarray of shape (n_features,)

Estimated location, i.e. the estimated mean.

covariance_ndarray of shape (n_features, n_features)

Estimated covariance matrix

precision_ndarray of shape (n_features, n_features)

Estimated pseudo inverse matrix.

n_iter_int

Number of iterations run.

n_features_in_int

Number of features seen during fit.

New in version 0.24.

feature_names_in_ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

See also

graphical_lasso

L1-penalized covariance estimator.

GraphicalLassoCV

Sparse inverse covariance with cross-validated choice of the l1 penalty.

Examples

>>> import numpy as np
>>> from sklearn.covariance import GraphicalLasso
>>> true_cov = np.array([[0.8, 0.0, 0.2, 0.0],
...                      [0.0, 0.4, 0.0, 0.0],
...                      [0.2, 0.0, 0.3, 0.1],
...                      [0.0, 0.0, 0.1, 0.7]])
>>> np.random.seed(0)
>>> X = np.random.multivariate_normal(mean=[0, 0, 0, 0],
...                                   cov=true_cov,
...                                   size=200)
>>> cov = GraphicalLasso().fit(X)
>>> np.around(cov.covariance_, decimals=3)
array([[0.816, 0.049, 0.218, 0.019],
       [0.049, 0.364, 0.017, 0.034],
       [0.218, 0.017, 0.322, 0.093],
       [0.019, 0.034, 0.093, 0.69 ]])
>>> np.around(cov.location_, decimals=3)
array([0.073, 0.04 , 0.038, 0.143])

Methods

error_norm(comp_cov[, norm, scaling, squared])

Compute the Mean Squared Error between two covariance estimators.

fit(X[, y])

Fit the GraphicalLasso model to X.

get_params([deep])

Get parameters for this estimator.

get_precision()

Getter for the precision matrix.

mahalanobis(X)

Compute the squared Mahalanobis distances of given observations.

score(X_test[, y])

Compute the log-likelihood of X_test under the estimated Gaussian model.

set_params(**params)

Set the parameters of this estimator.

error_norm(comp_cov, norm='frobenius', scaling=True, squared=True)[source]

Compute the Mean Squared Error between two covariance estimators.

Parameters:
comp_covarray-like of shape (n_features, n_features)

The covariance to compare with.

norm{“frobenius”, “spectral”}, default=”frobenius”

The type of norm used to compute the error. Available error types: - ‘frobenius’ (default): sqrt(tr(A^t.A)) - ‘spectral’: sqrt(max(eigenvalues(A^t.A)) where A is the error (comp_cov - self.covariance_).

scalingbool, default=True

If True (default), the squared error norm is divided by n_features. If False, the squared error norm is not rescaled.

squaredbool, default=True

Whether to compute the squared error norm or the error norm. If True (default), the squared error norm is returned. If False, the error norm is returned.

Returns:
resultfloat

The Mean Squared Error (in the sense of the Frobenius norm) between self and comp_cov covariance estimators.

fit(X, y=None)[source]

Fit the GraphicalLasso model to X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Data from which to compute the covariance estimate.

yIgnored

Not used, present for API consistency by convention.

Returns:
selfobject

Returns the instance itself.

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

get_precision()[source]

Getter for the precision matrix.

Returns:
precision_array-like of shape (n_features, n_features)

The precision matrix associated to the current covariance object.

mahalanobis(X)[source]

Compute the squared Mahalanobis distances of given observations.

Parameters:
Xarray-like of shape (n_samples, n_features)

The observations, the Mahalanobis distances of the which we compute. Observations are assumed to be drawn from the same distribution than the data used in fit.

Returns:
distndarray of shape (n_samples,)

Squared Mahalanobis distances of the observations.

score(X_test, y=None)[source]

Compute the log-likelihood of X_test under the estimated Gaussian model.

The Gaussian model is defined by its mean and covariance matrix which are represented respectively by self.location_ and self.covariance_.

Parameters:
X_testarray-like of shape (n_samples, n_features)

Test data of which we compute the likelihood, where n_samples is the number of samples and n_features is the number of features. X_test is assumed to be drawn from the same distribution than the data used in fit (including centering).

yIgnored

Not used, present for API consistency by convention.

Returns:
resfloat

The log-likelihood of X_test with self.location_ and self.covariance_ as estimators of the Gaussian model mean and covariance matrix respectively.

set_params(**params)[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.