.. _sphx_glr_auto_examples_linear_model_plot_sparse_logistic_regression_mnist.py:

=====================================================
MNIST classification using multinomial logistic + L1
=====================================================

Here we fit a multinomial logistic regression with an L1 penalty on a subset
of the MNIST digits classification task. We use the SAGA algorithm for this
purpose: this is a solver that is fast when the number of samples is
significantly larger than the number of features, and that is able to finely
optimize non-smooth objective functions, as is the case with the l1-penalty.
Test accuracy reaches > 0.8, while the weight vectors remain *sparse* and
therefore more easily *interpretable*.

Note that the accuracy of this l1-penalized linear model is significantly
below what can be reached by an l2-penalized linear model or a non-linear
multi-layer perceptron model on this dataset.

.. image:: /auto_examples/linear_model/images/sphx_glr_plot_sparse_logistic_regression_mnist_001.png
    :align: center

.. rst-class:: sphx-glr-script-out

 Out::

    ________________________________________________________________________________
    [Memory] Calling __main__--home-ubuntu-scikit-learn-examples-linear_model-.fetch_mnist...
    fetch_mnist()
    _____________________________________________________fetch_mnist - 39.9s, 0.7min
    Sparsity with L1 penalty: 80.84%
    Test score with L1 penalty: 0.8381
    Example run in 43.516 s
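The core recipe used by this example (L1 penalty + SAGA solver, sparsity
measured as the fraction of zero coefficients) can be sketched on
scikit-learn's bundled 8x8 ``load_digits`` dataset, which needs no download.
This is a minimal illustration, not the example script itself; the ``C`` and
``tol`` values below are arbitrary choices for the smaller data:

.. code-block:: python

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Standardizing the features helps SAGA converge quickly.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # SAGA handles the non-smooth l1 objective directly and drives many
    # coefficients exactly to zero.  C and tol here are illustrative values.
    clf = LogisticRegression(penalty='l1', solver='saga', C=0.1, tol=0.01,
                             max_iter=200)
    clf.fit(X_train, y_train)

    sparsity = np.mean(clf.coef_ == 0) * 100
    print("Sparsity: %.2f%%" % sparsity)
    print("Test score: %.4f" % clf.score(X_test, y_test))

``clf.coef_`` has one row per class, so the same per-class weight images
plotted below can be produced from it; shrinking ``C`` increases sparsity at
the cost of accuracy.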
.. code-block:: python

    import time
    import io

    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.io.arff import loadarff

    from sklearn.datasets import get_data_home
    from sklearn.externals.joblib import Memory
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.utils import check_random_state

    try:
        from urllib.request import urlopen
    except ImportError:
        # Python 2
        from urllib2 import urlopen

    print(__doc__)

    # Author: Arthur Mensch
    # License: BSD 3 clause

    # Turn down for faster convergence
    t0 = time.time()
    train_samples = 5000

    # Cache the download so repeated runs reuse the local copy
    memory = Memory(get_data_home())


    @memory.cache()
    def fetch_mnist():
        content = urlopen(
            'https://www.openml.org/data/download/52667/mnist_784.arff').read()
        data, meta = loadarff(io.StringIO(content.decode('utf8')))
        # 784 float pixel columns followed by a one-byte class label
        data = data.view([('pixels', '<f8', 784), ('class', '|S1')])
        return data['pixels'], data['class']

    X, y = fetch_mnist()

    # Shuffle the data and keep a small training set for speed
    random_state = check_random_state(0)
    permutation = random_state.permutation(X.shape[0])
    X = X[permutation]
    y = y[permutation]
    X = X.reshape((X.shape[0], -1))

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=train_samples, test_size=10000)

    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # Turn up tolerance for faster convergence
    clf = LogisticRegression(C=50. / train_samples,
                             multi_class='multinomial',
                             penalty='l1', solver='saga', tol=0.1)
    clf.fit(X_train, y_train)
    sparsity = np.mean(clf.coef_ == 0) * 100
    score = clf.score(X_test, y_test)
    print("Sparsity with L1 penalty: %.2f%%" % sparsity)
    print("Test score with L1 penalty: %.4f" % score)

    # Plot the weight vector of each class as a 28 x 28 image
    coef = clf.coef_.copy()
    plt.figure(figsize=(10, 5))
    scale = np.abs(coef).max()
    for i in range(10):
        l1_plot = plt.subplot(2, 5, i + 1)
        l1_plot.imshow(coef[i].reshape(28, 28), interpolation='nearest',
                       cmap=plt.cm.RdBu, vmin=-scale, vmax=scale)
        l1_plot.set_xticks(())
        l1_plot.set_yticks(())
        l1_plot.set_xlabel('Class %i' % i)
    plt.suptitle('Classification vector for...')

    run_time = time.time() - t0
    print('Example run in %.3f s' % run_time)
    plt.show()

**Total running time of the script:** ( 0 minutes 43.516 seconds)

.. container:: sphx-glr-footer

    .. container:: sphx-glr-download

        :download:`Download Python source code: plot_sparse_logistic_regression_mnist.py <plot_sparse_logistic_regression_mnist.py>`

    .. container:: sphx-glr-download

        :download:`Download Jupyter notebook: plot_sparse_logistic_regression_mnist.ipynb <plot_sparse_logistic_regression_mnist.ipynb>`

.. rst-class:: sphx-glr-signature

    `Generated by Sphinx-Gallery <https://sphinx-gallery.readthedocs.io>`_