.. _data_reduction:

=====================================
Unsupervised dimensionality reduction
=====================================

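If the number of features is high, it may be useful to reduce it with an
unsupervised step before the supervised steps. Many
:ref:`unsupervised learning <无监督学习>` methods implement a ``transform``
method that can be used to reduce the dimensionality. Below we discuss three
widely used examples of this pattern.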

.. topic:: **Pipelining**

    The unsupervised data reduction and the supervised estimator can be
    chained in one step. See :ref:`pipeline`.
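
As a minimal sketch of such a chain, the snippet below reduces the digits
dataset with :class:`decomposition.PCA` and then fits a classifier on the
reduced features. The dataset, the choice of ``LogisticRegression``, and the
values ``n_components=16`` and ``max_iter=1000`` are assumptions made only for
illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)  # 1797 samples, 64 features

# Unsupervised reduction (PCA) chained with a supervised estimator.
# n_components=16 and max_iter=1000 are illustrative, not recommendations.
model = make_pipeline(PCA(n_components=16), LogisticRegression(max_iter=1000))
model.fit(X, y)

# Slicing off the final step leaves only the reduction: 64 features -> 16.
X_reduced = model[:-1].transform(X)
print(X_reduced.shape)  # (1797, 16)
```

Because the reduction is fitted inside the pipeline, it is re-fitted on each
training fold when the pipeline is passed to a cross-validation utility.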

.. currentmodule:: sklearn

PCA: principal component analysis
----------------------------------

:class:`decomposition.PCA` looks for combinations of features that capture
the variance of the original features well. See :ref:`decompositions`.
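
The following sketch illustrates the idea on synthetic data, assuming ten
observed features that are driven by two latent factors plus a small amount of
noise. The data-generating setup is hypothetical; only the ``PCA`` calls come
from the library.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)

# Synthetic data: 200 samples, 10 features built from 2 latent factors.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (200, 2)
# Fraction of the total variance retained by the two components.
print(pca.explained_variance_ratio_.sum())
```

Here two components should retain almost all of the variance, because the
remaining directions contain only the added noise.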

.. rubric:: Examples

* :ref:`sphx_glr_auto_examples_applications_plot_face_recognition.py` 

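Random projections
-------------------

The module :mod:`~sklearn.random_projection` provides several tools for data
reduction by random projections. See the relevant section of the
documentation: :ref:`random_projection`.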
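
A minimal sketch of one such tool is shown below. It assumes random Gaussian
input data and an allowed distortion of ``eps=0.5``; both are chosen only for
illustration. The target dimension is taken from the conservative estimate
returned by ``johnson_lindenstrauss_min_dim``.

```python
import numpy as np
from sklearn.random_projection import (
    GaussianRandomProjection,
    johnson_lindenstrauss_min_dim,
)

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 10000))  # 100 samples, 10000 features

# Conservative number of components suggested by the Johnson-Lindenstrauss
# lemma for 100 samples and a distortion tolerance of eps=0.5.
min_dim = int(johnson_lindenstrauss_min_dim(n_samples=100, eps=0.5))

transformer = GaussianRandomProjection(n_components=min_dim, random_state=0)
X_reduced = transformer.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
```

The projection matrix is drawn at random and does not depend on the values in
``X``, which makes this reduction cheap compared with methods that must
estimate directions from the data.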

.. rubric:: Examples

* :ref:`sphx_glr_auto_examples_miscellaneous_plot_johnson_lindenstrauss_bound.py` 

Feature agglomeration
------------------------

:class:`cluster.FeatureAgglomeration` applies
:ref:`hierarchical clustering <层次聚类>` to group together features that
behave similarly.
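
As a rough sketch, the snippet below merges the 64 pixel features of the
digits dataset into 16 groups. The dataset and the value ``n_clusters=16``
are assumptions chosen for illustration.

```python
from sklearn.cluster import FeatureAgglomeration
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)  # 1797 samples, 64 pixel features

agglo = FeatureAgglomeration(n_clusters=16)
X_reduced = agglo.fit_transform(X)

print(X_reduced.shape)  # (1797, 16)
# One cluster label per original feature.
print(agglo.labels_.shape)  # (64,)
```

With the default settings, each output feature is the mean of the original
features assigned to the same cluster.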

.. rubric:: Examples

* :ref:`sphx_glr_auto_examples_cluster_plot_feature_agglomeration_vs_univariate_selection.py` 
* :ref:`sphx_glr_auto_examples_cluster_plot_digits_agglomeration.py` 

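.. topic:: **Feature scaling**

   Note that if features have very different scaling or statistical
   properties, :class:`cluster.FeatureAgglomeration` may not be able to
   capture the links between related features. In these settings, using
   :class:`preprocessing.StandardScaler` can be useful.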
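
The effect can be sketched on hypothetical synthetic data: four features are
built from two independent signals, and one feature of each pair is multiplied
by 1000. All values here are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
base = rng.normal(size=(300, 2))
noise = 0.05 * rng.normal(size=(300, 4))

# Columns 0 and 1 follow base[:, 0]; columns 2 and 3 follow base[:, 1].
X = np.column_stack([base[:, 0], base[:, 0], base[:, 1], base[:, 1]]) + noise
# Columns 1 and 3 are put on a scale 1000 times larger than columns 0 and 2.
X[:, [1, 3]] *= 1000.0

raw = FeatureAgglomeration(n_clusters=2).fit(X)
scaled = make_pipeline(StandardScaler(), FeatureAgglomeration(n_clusters=2)).fit(X)

print("without scaling:", raw.labels_)
print("with scaling:   ", scaled[-1].labels_)
```

Without scaling, the two small-scale features are grouped together even though
they come from unrelated signals. After standardization, the grouping follows
the underlying signals instead.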