.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/model_selection/plot_randomized_search.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_model_selection_plot_randomized_search.py>`
        to download the full example code. or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_model_selection_plot_randomized_search.py:


=========================================================================
随机搜索与网格搜索在超参数估计中的比较
=========================================================================

比较随机搜索和网格搜索在优化线性SVM超参数时的表现,使用SGD进行训练。
所有影响学习的参数同时进行搜索(除了估计器的数量,因为这涉及时间/质量的权衡)。

随机搜索和网格搜索探索完全相同的参数空间。参数设置的结果非常相似,但随机搜索的运行时间大大缩短。

随机搜索的性能可能略差,这可能是由于噪声效应,并且不会影响到保留的测试集。

请注意,在实际操作中,使用网格搜索时不会同时搜索这么多不同的参数,而是只选择那些被认为最重要的参数。

.. GENERATED FROM PYTHON SOURCE LINES 16-89




.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    RandomizedSearchCV took 2.71 seconds for 15 candidates parameter settings.
    Model with rank: 1
    Mean validation score: 0.983 (std: 0.007)
    Parameters: {'alpha': np.float64(0.014515898657996523), 'average': False, 'l1_ratio': np.float64(0.12831708887652515)}

    Model with rank: 2
    Mean validation score: 0.981 (std: 0.018)
    Parameters: {'alpha': np.float64(0.0825944504273073), 'average': False, 'l1_ratio': np.float64(0.2869934312245894)}

    Model with rank: 3
    Mean validation score: 0.981 (std: 0.015)
    Parameters: {'alpha': np.float64(0.10794028792348503), 'average': False, 'l1_ratio': np.float64(0.29170373035507635)}

    GridSearchCV took 8.70 seconds for 60 candidate parameter settings.
    Model with rank: 1
    Mean validation score: 0.996 (std: 0.007)
    Parameters: {'alpha': np.float64(0.1), 'average': False, 'l1_ratio': np.float64(0.1111111111111111)}

    Model with rank: 2
    Mean validation score: 0.993 (std: 0.007)
    Parameters: {'alpha': np.float64(0.01), 'average': False, 'l1_ratio': np.float64(0.3333333333333333)}

    Model with rank: 3
    Mean validation score: 0.993 (std: 0.009)
    Parameters: {'alpha': np.float64(0.01), 'average': False, 'l1_ratio': np.float64(0.7777777777777777)}







|

.. code-block:: Python


    from time import time

    import numpy as np
    import scipy.stats as stats

    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

    # 获取一些数据
    # 
    # 
    X, y = load_digits(return_X_y=True, n_class=3)

    # 构建一个分类器
    clf = SGDClassifier(loss="hinge", penalty="elasticnet", fit_intercept=True)


    # 用于报告最佳分数的实用函数
    def report(results, n_top=3):
        for i in range(1, n_top + 1):
            candidates = np.flatnonzero(results["rank_test_score"] == i)
            for candidate in candidates:
                print("Model with rank: {0}".format(i))
                print(
                    "Mean validation score: {0:.3f} (std: {1:.3f})".format(
                        results["mean_test_score"][candidate],
                        results["std_test_score"][candidate],
                    )
                )
                print("Parameters: {0}".format(results["params"][candidate]))
                print("")


    # 指定参数和分布以进行采样
    param_dist = {
        "average": [True, False],
        "l1_ratio": stats.uniform(0, 1),
        "alpha": stats.loguniform(1e-2, 1e0),
    }

    # 运行随机搜索
    n_iter_search = 15
    random_search = RandomizedSearchCV(
        clf, param_distributions=param_dist, n_iter=n_iter_search
    )

    start = time()
    random_search.fit(X, y)
    print(
        "RandomizedSearchCV took %.2f seconds for %d candidates parameter settings."
        % ((time() - start), n_iter_search)
    )
    report(random_search.cv_results_)

    # 使用所有参数的完整网格
    param_grid = {
        "average": [True, False],
        "l1_ratio": np.linspace(0, 1, num=10),
        "alpha": np.power(10, np.arange(-2, 1, dtype=float)),
    }

    # 运行网格搜索
    grid_search = GridSearchCV(clf, param_grid=param_grid)
    start = time()
    grid_search.fit(X, y)

    print(
        "GridSearchCV took %.2f seconds for %d candidate parameter settings."
        % (time() - start, len(grid_search.cv_results_["params"]))
    )
    report(grid_search.cv_results_)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 11.416 seconds)


.. _sphx_glr_download_auto_examples_model_selection_plot_randomized_search.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/model_selection/plot_randomized_search.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_randomized_search.ipynb <plot_randomized_search.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_randomized_search.py <plot_randomized_search.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_randomized_search.zip <plot_randomized_search.zip>`


.. include:: plot_randomized_search.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_