.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/cluster/plot_dbscan.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_cluster_plot_dbscan.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_cluster_plot_dbscan.py:


===================================
Demo of DBSCAN clustering algorithm
===================================

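DBSCAN (Density-Based Spatial Clustering of Applications with Noise) finds core
samples in regions of high density and expands clusters from them. The
algorithm works well on data whose clusters have similar density.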

See the :ref:`sphx_glr_auto_examples_cluster_plot_cluster_comparison.py` example
for a demo of different clustering algorithms on 2D datasets.
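
The two steps described above (find core samples in dense regions, then expand
clusters from them) can be sketched in plain NumPy. This is a minimal,
quadratic-memory illustration and not scikit-learn's implementation: the
function name ``dbscan_sketch`` and the toy points are assumptions made for
this example, and only Euclidean distance is handled.

```python
from collections import deque

import numpy as np


def dbscan_sketch(X, eps, min_samples):
    """Toy DBSCAN (hypothetical helper): Euclidean distance, O(n^2) memory."""
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]

    # Pairwise Euclidean distances between all samples.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Each neighborhood includes the sample itself (distance 0).
    neighborhoods = [np.flatnonzero(row <= eps) for row in dists]
    is_core = np.array([len(nb) >= min_samples for nb in neighborhoods])

    labels = np.full(n_samples, -1)  # -1 means noise or not yet assigned
    cluster_id = 0
    for seed in range(n_samples):
        if labels[seed] != -1 or not is_core[seed]:
            continue
        # Grow a new cluster outward from an unassigned core sample.
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            for neighbor in neighborhoods[current]:
                if labels[neighbor] == -1:
                    labels[neighbor] = cluster_id
                    # Only core samples keep expanding the cluster.
                    if is_core[neighbor]:
                        queue.append(neighbor)
        cluster_id += 1
    return labels, is_core


# Two tight groups of three points each, plus one isolated point.
toy_points = [[0, 0], [0, 0.1], [0.1, 0], [5, 5], [5, 5.1], [5.1, 5], [10, 0]]
toy_labels, toy_is_core = dbscan_sketch(toy_points, eps=0.3, min_samples=3)
print(toy_labels)
print(toy_is_core)
```

On this toy input the two tight groups receive the labels 0 and 1, and the
isolated point keeps the noise label -1 because it is neither a core sample
nor within ``eps`` of one.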

.. GENERATED FROM PYTHON SOURCE LINES 14-18

Data generation
---------------

We use :func:`~sklearn.datasets.make_blobs` to create 3 synthetic clusters.

.. GENERATED FROM PYTHON SOURCE LINES 18-29

.. code-block:: Python


    from sklearn.datasets import make_blobs
    from sklearn.preprocessing import StandardScaler

    centers = [[1, 1], [-1, -1], [1, -1]]
    X, labels_true = make_blobs(
        n_samples=750, centers=centers, cluster_std=0.4, random_state=0
    )

    X = StandardScaler().fit_transform(X)








.. GENERATED FROM PYTHON SOURCE LINES 30-31

We can visualize the resulting data:

.. GENERATED FROM PYTHON SOURCE LINES 31-38

.. code-block:: Python



    import matplotlib.pyplot as plt

    plt.scatter(X[:, 0], X[:, 1])
    plt.show()




.. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_dbscan_001.png
   :alt: plot dbscan
   :srcset: /auto_examples/cluster/images/sphx_glr_plot_dbscan_001.png
   :class: sphx-glr-single-img





.. GENERATED FROM PYTHON SOURCE LINES 39-43

Compute DBSCAN
--------------

The labels assigned by :class:`~sklearn.cluster.DBSCAN` are available through
the `labels_` attribute. Noisy samples are given the label :math:`-1`.

.. GENERATED FROM PYTHON SOURCE LINES 43-59

.. code-block:: Python


    import numpy as np

    from sklearn import metrics
    from sklearn.cluster import DBSCAN

    db = DBSCAN(eps=0.3, min_samples=10).fit(X)
    labels = db.labels_

    # Number of clusters in labels, ignoring noise if present.
    n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)
    n_noise_ = list(labels).count(-1)

    print("Estimated number of clusters: %d" % n_clusters_)
    print("Estimated number of noise points: %d" % n_noise_)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Estimated number of clusters: 3
    Estimated number of noise points: 18
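
Beyond the two totals printed above, the same label array can be broken down
into per-cluster sizes. The sketch below uses a small hypothetical array
(``example_labels``) in the same format as ``labels_`` rather than the result
of the fit above, so the numbers are illustrative only.

```python
import numpy as np

# Hypothetical label array in the format of DBSCAN's labels_ attribute:
# non-negative integers are cluster ids, -1 marks noise.
example_labels = np.array([0, 0, 1, -1, 1, 1, 2, -1, 0, 2])

# Count how many samples carry each distinct label.
cluster_ids, counts = np.unique(example_labels, return_counts=True)
sizes = {int(k): int(v) for k, v in zip(cluster_ids, counts)}

# Separate the noise count from the real clusters.
n_noise = sizes.pop(-1, 0)

print("Cluster sizes:", sizes)
print("Noise points:", n_noise)
```

Popping the ``-1`` key with a default of 0 keeps the code valid when the
clustering produced no noise at all.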




.. GENERATED FROM PYTHON SOURCE LINES 60-67

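Clustering algorithms are fundamentally unsupervised learning methods. However,
since :func:`~sklearn.datasets.make_blobs` gives access to the true labels of
the synthetic clusters, we can quantify the quality of the resulting clusters
with evaluation metrics that use this "supervised" ground truth information.
Examples of such metrics are homogeneity, completeness, V-measure, the Rand
index, the adjusted Rand index and adjusted mutual information (AMI).

If the ground truth labels are not known, evaluation can only rely on the model
results themselves. In that case, the Silhouette Coefficient comes in handy.

For more information, see the
:ref:`sphx_glr_auto_examples_cluster_plot_adjusted_for_chance_measures.py`
example or the :ref:`clustering_evaluation` module.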
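
To build intuition for what these supervised metrics reward, here is a small
sketch on hand-written label lists. The lists (``truth``, ``renamed``,
``over_split``, ``merged``) are hypothetical and unrelated to the blobs
dataset; they only illustrate that the metrics compare partitions rather than
label values.

```python
from sklearn import metrics

truth = [0, 0, 1, 1]
renamed = [1, 1, 0, 0]     # same partition as truth, cluster ids swapped
over_split = [0, 1, 2, 3]  # every sample in its own cluster
merged = [0, 0, 0, 0]      # everything in a single cluster

# The cluster ids themselves do not matter, only the grouping does.
ari_renamed = metrics.adjusted_rand_score(truth, renamed)

# Over-splitting keeps clusters pure (homogeneous) but breaks up classes.
hom_split = metrics.homogeneity_score(truth, over_split)
com_split = metrics.completeness_score(truth, over_split)

# Merging keeps classes together (complete) but mixes them in one cluster.
hom_merged = metrics.homogeneity_score(truth, merged)
com_merged = metrics.completeness_score(truth, merged)

print(f"ARI, renamed labels: {ari_renamed:.3f}")
print(f"Over-split: homogeneity={hom_split:.3f}, completeness={com_split:.3f}")
print(f"Merged: homogeneity={hom_merged:.3f}, completeness={com_merged:.3f}")
```

Homogeneity and completeness pull in opposite directions in these two extreme
cases, which is why their harmonic mean, the V-measure, is usually reported
alongside them.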

.. GENERATED FROM PYTHON SOURCE LINES 67-78

.. code-block:: Python


    print(f"Homogeneity: {metrics.homogeneity_score(labels_true, labels):.3f}")
    print(f"Completeness: {metrics.completeness_score(labels_true, labels):.3f}")
    print(f"V-measure: {metrics.v_measure_score(labels_true, labels):.3f}")
    print(f"Adjusted Rand Index: {metrics.adjusted_rand_score(labels_true, labels):.3f}")
    print(
        "Adjusted Mutual Information:"
        f" {metrics.adjusted_mutual_info_score(labels_true, labels):.3f}"
    )
    print(f"Silhouette Coefficient: {metrics.silhouette_score(X, labels):.3f}")





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Homogeneity: 0.953
    Completeness: 0.883
    V-measure: 0.917
    Adjusted Rand Index: 0.952
    Adjusted Mutual Information: 0.916
    Silhouette Coefficient: 0.626
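
One caveat about the Silhouette Coefficient above: to our understanding,
:func:`~sklearn.metrics.silhouette_score` treats every distinct label as a
cluster, so the noise label ``-1`` is scored as if it were one more cluster.
The sketch below recreates the same data and fit so that it runs on its own,
then compares the score with and without noise samples. Dropping noise before
scoring is a choice made for this illustration, not something the original
example does.

```python
from sklearn import metrics
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Same data and parameters as in the example, repeated for self-containment.
X, _ = make_blobs(
    n_samples=750,
    centers=[[1, 1], [-1, -1], [1, -1]],
    cluster_std=0.4,
    random_state=0,
)
X = StandardScaler().fit_transform(X)
labels = DBSCAN(eps=0.3, min_samples=10).fit(X).labels_

# Boolean mask of samples that were assigned to a real cluster.
clustered = labels != -1

score_all = metrics.silhouette_score(X, labels)
score_clustered = metrics.silhouette_score(X[clustered], labels[clustered])

print(f"Silhouette, noise kept as a cluster: {score_all:.3f}")
print(f"Silhouette, noise samples removed: {score_clustered:.3f}")
```

The second call requires at least two clusters to remain after filtering;
otherwise ``silhouette_score`` raises an error.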




.. GENERATED FROM PYTHON SOURCE LINES 79-83

Plot results
------------

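Core samples (large dots) and non-core samples (small dots) are color-coded
according to the assigned cluster. Samples tagged as noise are shown in black.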

.. GENERATED FROM PYTHON SOURCE LINES 83-118

.. code-block:: Python


    unique_labels = set(labels)
    core_samples_mask = np.zeros_like(labels, dtype=bool)
    core_samples_mask[db.core_sample_indices_] = True

    colors = [plt.cm.Spectral(each) for each in np.linspace(0, 1, len(unique_labels))]
    for k, col in zip(unique_labels, colors):
        if k == -1:
            # Black used for noise.
            col = [0, 0, 0, 1]

        class_member_mask = labels == k

        xy = X[class_member_mask & core_samples_mask]
        plt.plot(
            xy[:, 0],
            xy[:, 1],
            "o",
            markerfacecolor=tuple(col),
            markeredgecolor="k",
            markersize=14,
        )

        xy = X[class_member_mask & ~core_samples_mask]
        plt.plot(
            xy[:, 0],
            xy[:, 1],
            "o",
            markerfacecolor=tuple(col),
            markeredgecolor="k",
            markersize=6,
        )

    plt.title(f"Estimated number of clusters: {n_clusters_}")
    plt.show()



.. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_dbscan_002.png
   :alt: Estimated number of clusters: 3
   :srcset: /auto_examples/cluster/images/sphx_glr_plot_dbscan_002.png
   :class: sphx-glr-single-img
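
The three visual categories in the plot (large dots, small colored dots, black
dots) correspond to core, border and noise samples. The sketch below shows how
the masks relate, using small hypothetical arrays in place of ``db.labels_``
and ``db.core_sample_indices_``.

```python
import numpy as np

# Hypothetical stand-ins for db.labels_ and db.core_sample_indices_.
example_labels = np.array([0, 0, 0, 1, 1, -1, 1, 0])
example_core_indices = np.array([0, 1, 3, 4])

# Same construction as core_samples_mask in the plotting code.
core_mask = np.zeros_like(example_labels, dtype=bool)
core_mask[example_core_indices] = True

# Noise samples carry the label -1; border samples belong to a cluster
# without being core samples themselves.
noise_mask = example_labels == -1
border_mask = ~core_mask & ~noise_mask

n_core = int(core_mask.sum())
n_border = int(border_mask.sum())
n_noise = int(noise_mask.sum())
print(f"core={n_core}, border={n_border}, noise={n_noise}")
```

Every sample falls into exactly one of the three masks, so the three counts
add up to the number of samples.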






.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.082 seconds)


.. _sphx_glr_download_auto_examples_cluster_plot_dbscan.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/cluster/plot_dbscan.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_dbscan.ipynb <plot_dbscan.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_dbscan.py <plot_dbscan.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_dbscan.zip <plot_dbscan.zip>`


.. include:: plot_dbscan.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_