.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/model_selection/plot_randomized_search.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_model_selection_plot_randomized_search.py>`
        to download the full example code or to run this example in your
        browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_model_selection_plot_randomized_search.py:


=========================================================================
Comparing randomized search and grid search for hyperparameter estimation
=========================================================================

Compare randomized search and grid search for optimizing hyperparameters of a
linear SVM with SGD training. All parameters that influence the learning are
searched simultaneously (except for the number of estimators, which poses a
time / quality tradeoff).

The randomized search and the grid search explore exactly the same space of
parameters. The resulting parameter settings are quite similar, while the run
time for randomized search is drastically lower.

The performance may be slightly worse for the randomized search; this is
likely a noise effect and would not carry over to a held-out test set.

Note that in practice, one would not search over this many different
parameters simultaneously using grid search, but pick only the ones deemed
most important.

.. GENERATED FROM PYTHON SOURCE LINES 21-83

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    RandomizedSearchCV took 31.72 seconds for 20 candidate parameter settings.
    Model with rank: 1
    Mean validation score: 0.920 (std: 0.028)
    Parameters: {'alpha': 0.07316411520495676, 'average': False, 'l1_ratio': 0.29007760721044407}

    Model with rank: 2
    Mean validation score: 0.920 (std: 0.029)
    Parameters: {'alpha': 0.0005223493320259539, 'average': True, 'l1_ratio': 0.7936977033574206}

    Model with rank: 3
    Mean validation score: 0.918 (std: 0.031)
    Parameters: {'alpha': 0.00025790124268693137, 'average': True, 'l1_ratio': 0.5699649107012649}

    GridSearchCV took 166.67 seconds for 100 candidate parameter settings.
    Model with rank: 1
    Mean validation score: 0.931 (std: 0.026)
    Parameters: {'alpha': 0.0001, 'average': True, 'l1_ratio': 0.0}

    Model with rank: 2
    Mean validation score: 0.928 (std: 0.030)
    Parameters: {'alpha': 0.0001, 'average': True, 'l1_ratio': 0.1111111111111111}

    Model with rank: 3
    Mean validation score: 0.927 (std: 0.026)
    Parameters: {'alpha': 0.0001, 'average': True, 'l1_ratio': 0.5555555555555556}


|
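
As an aside (an editorial addition, not part of the generated example), the
candidate settings that ``RandomizedSearchCV`` draws can be previewed with
:class:`~sklearn.model_selection.ParameterSampler`, which performs the same
sampling step. A minimal sketch using the same distributions as the example
below:

.. code-block:: python

    import scipy.stats as stats
    from sklearn.model_selection import ParameterSampler
    from sklearn.utils.fixes import loguniform

    param_dist = {'average': [True, False],
                  'l1_ratio': stats.uniform(0, 1),
                  'alpha': loguniform(1e-4, 1e0)}

    # Draw five candidate settings, the way RandomizedSearchCV does when
    # it builds its candidate list; random_state is fixed here only so
    # the preview is reproducible.
    for params in ParameterSampler(param_dist, n_iter=5, random_state=0):
        print(params)

The full example source follows.
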
.. code-block:: default


    print(__doc__)

    import numpy as np

    from time import time
    import scipy.stats as stats

    from sklearn.utils.fixes import loguniform
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier

    # get some data
    X, y = load_digits(return_X_y=True)

    # build a classifier
    clf = SGDClassifier(loss='hinge', penalty='elasticnet', fit_intercept=True)


    # Utility function to report best scores
    def report(results, n_top=3):
        for i in range(1, n_top + 1):
            candidates = np.flatnonzero(results['rank_test_score'] == i)
            for candidate in candidates:
                print("Model with rank: {0}".format(i))
                print("Mean validation score: {0:.3f} (std: {1:.3f})"
                      .format(results['mean_test_score'][candidate],
                              results['std_test_score'][candidate]))
                print("Parameters: {0}".format(results['params'][candidate]))
                print("")


    # specify parameters and distributions to sample from
    param_dist = {'average': [True, False],
                  'l1_ratio': stats.uniform(0, 1),
                  'alpha': loguniform(1e-4, 1e0)}

    # run randomized search
    n_iter_search = 20
    random_search = RandomizedSearchCV(clf, param_distributions=param_dist,
                                       n_iter=n_iter_search)

    start = time()
    random_search.fit(X, y)
    print("RandomizedSearchCV took %.2f seconds for %d candidate"
          " parameter settings." % ((time() - start), n_iter_search))
    report(random_search.cv_results_)

    # use a full grid over all parameters
    param_grid = {'average': [True, False],
                  'l1_ratio': np.linspace(0, 1, num=10),
                  'alpha': np.power(10, np.arange(-4, 1, dtype=float))}

    # run grid search
    grid_search = GridSearchCV(clf, param_grid=param_grid)
    start = time()
    grid_search.fit(X, y)

    print("GridSearchCV took %.2f seconds for %d candidate parameter settings."
          % (time() - start, len(grid_search.cv_results_['params'])))
    report(grid_search.cv_results_)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 3 minutes 18.475 seconds)


.. _sphx_glr_download_auto_examples_model_selection_plot_randomized_search.py:


.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: binder-badge

    .. image:: images/binder_badge_logo.svg
      :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/0.24.X?urlpath=lab/tree/notebooks/auto_examples/model_selection/plot_randomized_search.ipynb
      :alt: Launch binder
      :width: 150 px


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_randomized_search.py <plot_randomized_search.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_randomized_search.ipynb <plot_randomized_search.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
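
As a closing editorial note (not part of the generated example): the
description above argues that the small performance gap between the two
searches is likely noise and would not carry over to a held-out test set. A
minimal sketch of how one might check this, holding out a test split before
searching and scoring the refit best estimator once; the split and the
``random_state`` are illustrative assumptions:

.. code-block:: python

    import scipy.stats as stats
    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import RandomizedSearchCV, train_test_split
    from sklearn.utils.fixes import loguniform

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = SGDClassifier(loss='hinge', penalty='elasticnet', fit_intercept=True)
    param_dist = {'average': [True, False],
                  'l1_ratio': stats.uniform(0, 1),
                  'alpha': loguniform(1e-4, 1e0)}

    search = RandomizedSearchCV(clf, param_distributions=param_dist, n_iter=20)
    search.fit(X_train, y_train)

    # score() delegates to the refit best estimator, evaluated once on
    # data the search never saw.
    print("held-out accuracy: %.3f" % search.score(X_test, y_test))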