`sklearn.linear_model`.ElasticNet

class sklearn.linear_model.ElasticNet(alpha=1.0, *, l1_ratio=0.5, fit_intercept=True, precompute=False, max_iter=1000, copy_X=True, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')

Linear regression with combined L1 and L2 priors as regularizer.

Minimizes the objective function:

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2

If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to:

a * ||w||_1 + 0.5 * b * ||w||_2^2

where:

alpha = a + b and l1_ratio = a / (a + b)
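As a minimal sketch of the mapping above, the helper below converts separate L1 and L2 weights into the two constructor arguments. The name `penalties_to_params` is hypothetical (it is not part of scikit-learn), and the weights 0.3 and 0.1 are arbitrary illustration values.

```python
from sklearn.linear_model import ElasticNet


def penalties_to_params(a, b):
    """Map an L1 weight `a` and an L2 weight `b` to (alpha, l1_ratio).

    Hypothetical helper for illustration; requires a + b > 0.
    """
    alpha = a + b
    l1_ratio = a / (a + b)
    return alpha, l1_ratio


# Arbitrary example weights: a = 0.3 on the L1 term, b = 0.1 on the L2 term.
alpha, l1_ratio = penalties_to_params(a=0.3, b=0.1)

# alpha * l1_ratio recovers a, and alpha * (1 - l1_ratio) recovers b.
model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio)
```

Multiplying back out, `alpha * l1_ratio` gives `a` and `alpha * (1 - l1_ratio)` gives `b`, which is a quick way to check the conversion.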

The parameter l1_ratio corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. Specifically, l1_ratio = 1 is the lasso penalty. Currently, l1_ratio <= 0.01 is not reliable, unless you supply your own sequence of alpha.
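The statement that l1_ratio = 1 is the lasso penalty can be checked with a short sketch. It assumes the `Lasso` estimator from the same module and uses an arbitrary synthetic dataset and an arbitrary alpha of 0.5.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# Arbitrary synthetic data, chosen only for illustration.
X, y = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)

# With l1_ratio=1.0 the L2 term of the objective vanishes,
# leaving only the L1 (lasso) penalty.
enet = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

coefficients_match = np.allclose(enet.coef_, lasso.coef_)
```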

See also

ElasticNetCV: Elastic net model with best model selection by cross-validation.
SGDRegressor: Implements elastic net regression with incremental training.
SGDClassifier: Implements logistic regression with elastic net penalty (SGDClassifier(loss="log_loss", penalty="elasticnet")).

Notes

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

The precise stopping criteria based on tol are the following: First, check that the maximum coordinate update, i.e. \(\max_j |w_j^{new} - w_j^{old}|\), is smaller than tol times the maximum absolute coefficient, \(\max_j |w_j|\). If so, then additionally check whether the dual gap is smaller than tol times \(||y||_2^2 / n_{\text{samples}}\).
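The two thresholds described above can be written out in NumPy as a rough sketch. This is not the library's implementation: the function names are hypothetical, the sketch assumes the maximum absolute coefficient is taken over the updated coefficients, and it only computes the threshold for the dual gap rather than the gap itself.

```python
import numpy as np


def coordinate_update_is_small(w_old, w_new, tol):
    """First check: largest coefficient change vs. tol * largest |coefficient|.

    Assumption: the maximum absolute coefficient is taken over w_new.
    """
    max_update = np.max(np.abs(w_new - w_old))
    max_coef = np.max(np.abs(w_new))
    return bool(max_update < tol * max_coef)


def dual_gap_threshold(y, tol):
    """Second check compares the dual gap against tol * ||y||_2^2 / n_samples."""
    return tol * float(np.dot(y, y)) / y.shape[0]


w_old = np.array([1.0, 2.0])

# A tiny update passes the first check; a large one does not.
small_step = coordinate_update_is_small(w_old, np.array([1.0, 2.00001]), tol=1e-4)
large_step = coordinate_update_is_small(w_old, np.array([1.0, 2.1]), tol=1e-4)

# ||y||_2^2 = 1 + 4 + 9 = 14 and n_samples = 3.
threshold = dual_gap_threshold(np.array([1.0, 2.0, 3.0]), tol=1e-4)
```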

Examples

>>> from sklearn.linear_model import ElasticNet
>>> from sklearn.datasets import make_regression

>>> X, y = make_regression(n_features=2, random_state=0)
>>> regr = ElasticNet(random_state=0)
>>> regr.fit(X, y)
ElasticNet(random_state=0)
>>> print(regr.coef_)
[18.83816048 64.55968825]
>>> print(regr.intercept_)
1.451...
>>> print(regr.predict([[0, 0]]))
[1.451...]

Methods

`fit`(X, y[, sample_weight, check_input])	Fit model with coordinate descent.
`get_params`([deep])	Get parameters for this estimator.
`path`(X, y, *[, l1_ratio, eps, n_alphas, ...])	Compute elastic net path with coordinate descent.
`predict`(X)	Predict using the linear model.
`score`(X, y[, sample_weight])	Return the coefficient of determination of the prediction.
`set_params`(**params)	Set the parameters of this estimator.

fit(X, y, sample_weight=None, check_input=True)

Fit model with coordinate descent.

Parameters:

X : {ndarray, sparse matrix} of shape (n_samples, n_features)
    Data.
y : {ndarray, sparse matrix} of shape (n_samples,) or (n_samples, n_targets)
    Target. Will be cast to X’s dtype if necessary.
sample_weight : float or array-like of shape (n_samples,), default=None
    Sample weights. Internally, the sample_weight vector will be rescaled to sum to n_samples.
    New in version 0.23.
check_input : bool, default=True
    Allows bypassing several input checks. Don’t use this parameter unless you know what you are doing.

Returns:

self : object
    Fitted estimator.
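A short sketch of fitting with sample_weight follows. Because the weights are rescaled internally to sum to n_samples, multiplying every weight by the same constant should leave the fit unchanged. The dataset, the weights, and alpha=0.1 are arbitrary illustration values.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

# Arbitrary synthetic data and per-sample weights.
X, y = make_regression(n_samples=80, n_features=3, noise=1.0, random_state=0)
rng = np.random.default_rng(0)
weights = rng.uniform(0.5, 2.0, size=X.shape[0])

# Same relative weights, different overall scale.
fit_a = ElasticNet(alpha=0.1).fit(X, y, sample_weight=weights)
fit_b = ElasticNet(alpha=0.1).fit(X, y, sample_weight=10.0 * weights)

# Only the relative sizes of the weights matter after rescaling.
same_coefficients = np.allclose(fit_a.coef_, fit_b.coef_)
```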

Notes

Coordinate descent is an algorithm that considers one column of the data at a time, so the X input is automatically converted to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.
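One way to follow this advice is sketched below: the array is converted to Fortran (column-major) order once, before fitting. The data and the coefficient vector are made up for illustration, and `np.asfortranarray` itself makes a copy when the input is C-ordered, so the saving comes from doing the conversion a single time up front.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)

# Convert once to column-major (Fortran) order before fitting.
X = np.asfortranarray(rng.normal(size=(50, 3)))
y = X @ np.array([1.0, 0.0, -2.0])  # made-up ground-truth coefficients

is_fortran = X.flags["F_CONTIGUOUS"]

model = ElasticNet(alpha=0.1).fit(X, y)
```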

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict
    Parameter names mapped to their values.
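A brief usage sketch of get_params together with set_params from the Methods table; the parameter values are arbitrary.

```python
from sklearn.linear_model import ElasticNet

model = ElasticNet(alpha=0.5, l1_ratio=0.2)

# Dictionary of constructor parameters and their current values.
params = model.get_params()

# set_params updates parameters in place and returns the estimator.
model.set_params(alpha=0.1)
updated_alpha = model.get_params()["alpha"]
```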

static path(X, y, *, l1_ratio=0.5, eps=0.001, n_alphas=100, alphas=None, precompute='auto', Xy=None, copy_X=True, coef_init=None, verbose=False, return_n_iter=False, positive=False, check_input=True, **params)

Compute elastic net path with coordinate descent.
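A minimal usage sketch, assuming a single-output problem and an explicit, arbitrary grid of five alpha values. The return values shown here (the alphas, the coefficients along the path, and the dual gaps) are not listed in this section, so treat the unpacking and the shapes in the comments as an assumption to verify against the full reference.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

# Arbitrary synthetic single-output data.
X, y = make_regression(n_samples=60, n_features=4, noise=0.5, random_state=0)

# Arbitrary grid of regularization strengths, from strong to weak.
alpha_grid = np.array([10.0, 3.0, 1.0, 0.3, 0.1])

# path is a static method, so no estimator instance is needed.
alphas_out, coefs, dual_gaps = ElasticNet.path(
    X, y, l1_ratio=0.5, alphas=alpha_grid
)

# For single-output data, coefs holds one column of coefficients per alpha.
n_features, n_alphas = coefs.shape
```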

The elastic net optimization function varies for mono and multi-outputs.

For mono-output tasks it is:

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2

For multi-output tasks it is:

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2
+ alpha * l1_ratio * ||W||_21
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2

Where:

||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the Euclidean norms of the rows of W.
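The multi-output objective and the ||W||_21 norm above can be evaluated directly in NumPy. The sketch below is only an illustration of the formulas: the function names are hypothetical, and the small matrices are chosen so the row norms (5, 0 and 13) are easy to verify by hand.

```python
import numpy as np


def l21_norm(W):
    """Sum over rows i of sqrt(sum_j w_ij^2)."""
    return float(np.sum(np.sqrt(np.sum(W ** 2, axis=1))))


def multi_output_objective(X, Y, W, alpha, l1_ratio):
    """Evaluate the multi-output elastic net objective for a given W."""
    n_samples = X.shape[0]
    residual = Y - X @ W
    data_fit = np.sum(residual ** 2) / (2 * n_samples)  # squared Frobenius norm
    l1_part = alpha * l1_ratio * l21_norm(W)
    l2_part = 0.5 * alpha * (1 - l1_ratio) * np.sum(W ** 2)
    return float(data_fit + l1_part + l2_part)


# Rows have Euclidean norms 5, 0 and 13.
W = np.array([[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]])
row_norm_sum = l21_norm(W)

# With X = I and Y = 0 the residual is -W, so ||Y - XW||_Fro^2 = 194.
value = multi_output_objective(
    np.eye(3), np.zeros((3, 2)), W, alpha=1.0, l1_ratio=0.5
)
```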

Examples using `sklearn.linear_model.ElasticNet`

Release Highlights for scikit-learn 0.23

Fitting an Elastic Net with a precomputed Gram Matrix and Weighted Samples

Lasso and Elastic Net for Sparse Signals

Train error vs Test error
