`sklearn.linear_model`.Lasso

class sklearn.linear_model.Lasso(alpha=1.0, *, fit_intercept=True, normalize='deprecated', precompute=False, copy_X=True, max_iter=1000, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')

Linear Model trained with L1 prior as regularizer (aka the Lasso).

The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Technically the Lasso model is optimizing the same objective function as the Elastic Net with l1_ratio=1.0 (no L2 penalty).
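The equivalence with the Elastic Net can be checked numerically. The following is a minimal sketch on synthetic data (the data, the seed, and the helper `lasso_objective` are assumptions made for illustration, not part of the library). The helper also subtracts the fitted intercept from the residual, since `fit_intercept=True` by default.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic regression problem (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = X @ np.array([2.0, 0.0, -1.0, 0.0, 0.5]) + 0.1 * rng.normal(size=60)

alpha = 0.1
lasso = Lasso(alpha=alpha).fit(X, y)
enet = ElasticNet(alpha=alpha, l1_ratio=1.0).fit(X, y)


def lasso_objective(X, y, coef, intercept, alpha):
    """Hypothetical helper: (1 / (2 * n_samples)) * ||y - Xw - b||^2_2 + alpha * ||w||_1."""
    residual = y - X @ coef - intercept
    return residual @ residual / (2 * len(y)) + alpha * np.abs(coef).sum()


same_coef = np.allclose(lasso.coef_, enet.coef_)
obj_lasso = lasso_objective(X, y, lasso.coef_, lasso.intercept_, alpha)
obj_enet = lasso_objective(X, y, enet.coef_, enet.intercept_, alpha)
print(same_coef, obj_lasso, obj_enet)
```

Both estimators should report the same coefficients and the same objective value, because with `l1_ratio=1.0` the L2 term of the Elastic Net penalty vanishes.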

See also

lars_path: Regularization path using LARS.
lasso_path: Regularization path using Lasso.
LassoLars: Lasso path along the regularization parameter using the LARS algorithm.
LassoCV: Lasso alpha parameter by cross-validation.
LassoLarsCV: Cross-validated Lasso using the LARS algorithm.
sklearn.decomposition.sparse_encode: Sparse coding array estimator.

Notes

The algorithm used to fit the model is coordinate descent.
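To give a feel for what coordinate descent does on this objective, here is a simplified NumPy sketch. It is not the library's implementation: it omits the intercept, runs a fixed number of sweeps instead of checking a stopping criterion, and the function names are hypothetical.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_coordinate_descent(X, y, alpha, n_sweeps=500):
    """Cyclic coordinate descent for
    (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1, without intercept."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    col_sq_norms = (X ** 2).sum(axis=0) / n_samples
    residual = y - X @ w
    for _ in range(n_sweeps):
        for j in range(n_features):
            if col_sq_norms[j] == 0.0:
                continue  # an all-zero column cannot reduce the loss
            # Partial residual that excludes feature j's current contribution.
            residual += X[:, j] * w[j]
            rho = X[:, j] @ residual / n_samples
            # Exact minimizer of the objective in coordinate j alone.
            w[j] = soft_threshold(rho, alpha) / col_sq_norms[j]
            residual -= X[:, j] * w[j]
    return w


# Synthetic data (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.0]) + 0.1 * rng.normal(size=50)
alpha = 0.1
w = lasso_coordinate_descent(X, y, alpha)
print(w)
```

Each inner step solves a one-dimensional Lasso problem in closed form, which is why coefficients can become exactly zero. On well-conditioned data the result should be close to `Lasso(alpha=alpha, fit_intercept=False).fit(X, y).coef_`.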

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values of alpha specify stronger regularization. alpha corresponds to 1 / (2C) in other linear models such as LogisticRegression or LinearSVC. If an array is passed, penalties are assumed to be specific to the targets, so the number of penalties must match the number of targets.

The precise stopping criteria based on tol are the following: First, check that the maximum coordinate update, i.e. \(\max_j |w_j^{new} - w_j^{old}|\), is smaller than tol times the maximum absolute coefficient, \(\max_j |w_j|\). If so, then additionally check whether the dual gap is smaller than tol times \(||y||_2^2 / n_{\text{samples}}\).
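The practical effect of tol can be observed through the fitted attributes `n_iter_` and `dual_gap_`. The sketch below uses synthetic data and arbitrary tol values chosen for illustration; the exact iteration counts depend on the data and the library version.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = X[:, :3] @ np.array([1.0, -2.0, 3.0]) + 0.5 * rng.normal(size=100)

# Same model, two tolerances: a stricter tol makes both checks harder to pass.
loose = Lasso(alpha=0.05, tol=1e-2).fit(X, y)
tight = Lasso(alpha=0.05, tol=1e-8).fit(X, y)

print("iterations:", loose.n_iter_, tight.n_iter_)
print("dual gaps:", loose.dual_gap_, tight.dual_gap_)
```

With the default cyclic selection the solver follows the same sequence of updates in both runs, so the stricter tolerance should need at least as many iterations as the looser one.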

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.Lasso(alpha=0.1)
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
Lasso(alpha=0.1)
>>> print(clf.coef_)
[0.85 0.  ]
>>> print(clf.intercept_)
0.15...

Methods

`fit`(X, y[, sample_weight, check_input])	Fit model with coordinate descent.
`get_params`([deep])	Get parameters for this estimator.
`path`(X, y, *[, l1_ratio, eps, n_alphas, ...])	Compute elastic net path with coordinate descent.
`predict`(X)	Predict using the linear model.
`score`(X, y[, sample_weight])	Return the coefficient of determination of the prediction.
`set_params`(**params)	Set the parameters of this estimator.

fit(X, y, sample_weight=None, check_input=True)

Fit model with coordinate descent.

Parameters:

X ({ndarray, sparse matrix} of shape (n_samples, n_features)): Data.
y ({ndarray, sparse matrix} of shape (n_samples,) or (n_samples, n_targets)): Target. Will be cast to X’s dtype if necessary.
sample_weight (float or array-like of shape (n_samples,), default=None): Sample weights. Internally, the sample_weight vector will be rescaled to sum to n_samples. New in version 0.23.
check_input (bool, default=True): Allows bypassing several input checks. Don’t use this parameter unless you know what you are doing.

Returns:

self (object): Fitted estimator.
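Because the sample weights are rescaled internally, multiplying every weight by the same positive constant should not change the fit. A minimal sketch, assuming scikit-learn 0.23 or later and synthetic data; the tight `tol` is only there so that both runs converge to the same point.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data and weights (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 0.0, -1.0]) + 0.1 * rng.normal(size=40)
weights = rng.uniform(0.5, 2.0, size=40)

model_a = Lasso(alpha=0.1, tol=1e-10).fit(X, y, sample_weight=weights)
model_b = Lasso(alpha=0.1, tol=1e-10).fit(X, y, sample_weight=10.0 * weights)

# Only the relative sizes of the weights matter.
scale_invariant = np.allclose(model_a.coef_, model_b.coef_)
print(scale_invariant)
```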

Notes

Coordinate descent considers one column of the data at a time, so fit automatically converts the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.
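As an illustration of the memory-layout advice, the sketch below builds the same data in row-major (C) and column-major (Fortran) order with NumPy and fits both. The data are synthetic, and whether the conversion cost matters in practice depends on the size of X.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X_c = rng.normal(size=(30, 4))   # NumPy default layout: C-contiguous (row-major)
X_f = np.asfortranarray(X_c)     # column-major copy of the same values
y = X_c @ np.array([1.0, 0.0, 0.0, -1.0])

print(X_c.flags["F_CONTIGUOUS"], X_f.flags["F_CONTIGUOUS"])

# The layout affects memory handling only, not the fitted coefficients.
coef_c = Lasso(alpha=0.1).fit(X_c, y).coef_
coef_f = Lasso(alpha=0.1).fit(X_f, y).coef_
```

To skip the copy made by `np.asfortranarray`, the array can be allocated in column-major order from the start, for example with `np.empty(shape, order="F")`, and filled in place.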

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict): Parameter names mapped to their values.
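A short usage sketch of `get_params` together with `set_params` from the method table above (the values are arbitrary):

```python
from sklearn.linear_model import Lasso

model = Lasso(alpha=0.1)

# Snapshot of the constructor parameters as a plain dict.
params = model.get_params()
print(params["alpha"], params["max_iter"])

# set_params updates parameters in place and returns the estimator itself.
model.set_params(alpha=0.5)
```

The dict returned by `get_params` is a snapshot, so it still holds the old `alpha` after `set_params` is called.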

static path(X, y, *, l1_ratio=0.5, eps=0.001, n_alphas=100, alphas=None, precompute='auto', Xy=None, copy_X=True, coef_init=None, verbose=False, return_n_iter=False, positive=False, check_input=True, **params)

Compute elastic net path with coordinate descent.

The elastic net optimization function varies for mono and multi-outputs.

For mono-output tasks it is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2
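A minimal sketch of calling `path` for the mono-output case, on synthetic data with an arbitrary alpha grid. Since `l1_ratio` defaults to 0.5 in the signature above, the example passes `l1_ratio=1.0` explicitly so that the L2 term drops out and the path is a pure Lasso path.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.0]) + 0.1 * rng.normal(size=50)

# With return_n_iter=False (the default) three arrays are returned.
alphas, coefs, dual_gaps = Lasso.path(
    X, y, l1_ratio=1.0, alphas=np.array([1.0, 0.1, 0.01])
)

# One column of coefficients per alpha.
for alpha, coef in zip(alphas, coefs.T):
    print(alpha, np.count_nonzero(coef))
```

`coefs` has shape (n_features, n_alphas), so column k holds the coefficients obtained for `alphas[k]`.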

For multi-output tasks it is:

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2
+ alpha * l1_ratio * ||W||_21
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2

Where:

\(||W||_{21} = \sum_i \sqrt{\sum_j w_{ij}^2}\)

i.e. the sum of the Euclidean norms of the rows.
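The mixed norm can be computed in a couple of lines of NumPy. The helper name `l21_norm` and the matrix below are hypothetical, chosen so that the row norms are easy to verify by hand.

```python
import numpy as np


def l21_norm(W):
    """Sum over rows i of the Euclidean norm of row i."""
    return np.sqrt((W ** 2).sum(axis=1)).sum()


# Row norms are 5, 0 and 13.
W = np.array([[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]])
value = l21_norm(W)
print(value)
```

An all-zero row contributes nothing to the norm, which is how this penalty can discard whole rows of W at once.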

Examples using `sklearn.linear_model.Lasso`

Release Highlights for scikit-learn 0.23

Compressive sensing: tomography reconstruction with L1 prior (Lasso)

Joint feature selection with multi-task Lasso

Lasso and Elastic Net for Sparse Signals

Lasso on dense and sparse data

Cross-validation on diabetes Dataset Exercise
