Fitting an Elastic Net with a precomputed Gram Matrix and Weighted Samples
The following example shows how to precompute the gram matrix while using weighted samples with an ElasticNet.
If weighted samples are used, the design matrix must be centered and then rescaled by the square root of the weight vector before the gram matrix is computed.
Note that the sample_weight vector is also rescaled to sum to
n_samples; see the
documentation for the sample_weight parameter of fit.
Let’s start by loading the dataset and creating some sample weights.
import numpy as np

from sklearn.datasets import make_regression

rng = np.random.RandomState(0)

n_samples = int(1e5)
X, y = make_regression(n_samples=n_samples, noise=0.5, random_state=rng)

sample_weight = rng.lognormal(size=n_samples)
# normalize the sample weights
normalized_weights = sample_weight * (n_samples / (sample_weight.sum()))
To fit the elastic net using the
precompute option together with the sample
weights, we must first center the design matrix, and rescale it by the
normalized weights prior to computing the gram matrix.
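The centering and rescaling step described above can be sketched as follows. This is a minimal illustration, not necessarily the exact code of the original example: it uses a small random design matrix in place of the dataset loaded earlier, and the names X_offset, X_centered, X_scaled and gram are chosen here for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

# Small illustrative sizes (assumption; the example above uses 1e5 samples)
n_samples, n_features = 1000, 20
X = rng.normal(size=(n_samples, n_features))

sample_weight = rng.lognormal(size=n_samples)
# normalize the sample weights so that they sum to n_samples
normalized_weights = sample_weight * (n_samples / sample_weight.sum())

# Center each column by its weighted mean
X_offset = np.average(X, axis=0, weights=normalized_weights)
X_centered = X - X_offset

# Rescale each row by the square root of its weight
X_scaled = X_centered * np.sqrt(normalized_weights)[:, np.newaxis]

# Gram matrix of the centered, rescaled design matrix
gram = np.dot(X_scaled.T, X_scaled)
```

Computed this way, gram equals X_centered.T @ diag(normalized_weights) @ X_centered, which is why the square root of the weights is applied to the rows before taking the product.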
We can now proceed with fitting. We must pass the centered design matrix to
fit; otherwise the elastic net estimator will detect that it is uncentered
and discard the gram matrix we passed. However, if we pass the scaled design
matrix, the preprocessing code will incorrectly rescale it a second time.
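A self-contained sketch of the fitting step might look like the following. It assumes a smaller dataset than the one used above so that it runs quickly, and it repeats the centering and rescaling so the block stands on its own. The variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

rng = np.random.RandomState(0)

# Smaller illustrative problem (assumption; the example above uses 1e5 samples)
n_samples, n_features = 1000, 20
X, y = make_regression(
    n_samples=n_samples, n_features=n_features, noise=0.5, random_state=rng
)

sample_weight = rng.lognormal(size=n_samples)
normalized_weights = sample_weight * (n_samples / sample_weight.sum())

# Center by the weighted mean, then rescale rows by sqrt(weight)
X_offset = np.average(X, axis=0, weights=normalized_weights)
X_centered = X - X_offset
X_scaled = X_centered * np.sqrt(normalized_weights)[:, np.newaxis]
gram = np.dot(X_scaled.T, X_scaled)

# Pass the Gram matrix via precompute, but fit on the *centered*
# (not the scaled) design matrix, together with the weights
lm = ElasticNet(alpha=0.01, precompute=gram)
lm.fit(X_centered, y, sample_weight=normalized_weights)
```

Because the model is fitted on X_centered, new data would need the same X_offset subtracted before calling predict.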
ElasticNet(alpha=0.01,
           precompute=array([[ 9.98809919e+04, -4.48938813e+02, -1.03237920e+03, ...,
        -2.25349312e+02, -3.53959628e+02, -1.67451144e+02],
       [-4.48938813e+02,  1.00768662e+05,  1.19112072e+02, ...,
        -1.07963978e+03,  7.47987268e+01, -5.76195467e+02],
       [-1.03237920e+03,  1.19112072e+02,  1.00393284e+05, ...,
        -3.07582983e+02,  6.66670169e+02,  2.65799352e+02],
       ...,
       [-2.25349312e+02, -1.07963978e+03, -3.07582983e+02, ...,
         9.99891212e+04, -4.58195950e+02, -1.58667835e+02],
       [-3.53959628e+02,  7.47987268e+01,  6.66670169e+02, ...,
        -4.58195950e+02,  9.98350372e+04,  5.60836363e+02],
       [-1.67451144e+02, -5.76195467e+02,  2.65799352e+02, ...,
        -1.58667835e+02,  5.60836363e+02,  1.00911944e+05]]))