SpectralEmbedding

class sklearn.manifold.SpectralEmbedding(n_components=2, *, affinity='nearest_neighbors', gamma=None, random_state=None, eigen_solver=None, eigen_tol='auto', n_neighbors=None, n_jobs=None)

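Spectral embedding for non-linear dimensionality reduction.

Forms an affinity matrix given by the specified function and applies spectral decomposition to the corresponding graph Laplacian. The resulting transformation is given by the value of the eigenvectors for each data point.

Note: the algorithm actually implemented here is Laplacian Eigenmaps.

Read more in the User Guide.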
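
The two steps described above (build an affinity matrix, then decompose the graph Laplacian) can be sketched with NumPy and SciPy. This is only an illustration under simplifying assumptions: the helper name `laplacian_eigenmaps_sketch` is hypothetical, the problem is small enough to use a dense eigensolver, and the scaling and sign conventions of the result may differ from what SpectralEmbedding returns.

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.neighbors import kneighbors_graph


def laplacian_eigenmaps_sketch(X, n_components=2, n_neighbors=10):
    """Illustrative helper (hypothetical name), not scikit-learn's own code."""
    # Affinity: symmetrized k-nearest-neighbor connectivity graph.
    knn = kneighbors_graph(X, n_neighbors=n_neighbors, include_self=True)
    affinity = 0.5 * (knn + knn.T)
    # Symmetric normalized graph Laplacian, densified for a small problem.
    lap = laplacian(affinity, normed=True).toarray()
    # eigh returns eigenvalues (and matching eigenvectors) in ascending order.
    _, eigenvectors = np.linalg.eigh(lap)
    # Drop the first (trivial) eigenvector and keep the next n_components.
    return eigenvectors[:, 1:n_components + 1]


rng = np.random.RandomState(0)
X_demo = rng.rand(60, 3)
Y_demo = laplacian_eigenmaps_sketch(X_demo)
print(Y_demo.shape)
```

Each row of `Y_demo` holds the embedding coordinates of one sample, read off from the selected eigenvectors.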

Parameters:

n_components : int, default=2

The dimension of the projected subspace.

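affinity : {'nearest_neighbors', 'rbf', 'precomputed', 'precomputed_nearest_neighbors'} or callable, default='nearest_neighbors'

How to construct the affinity matrix.

'nearest_neighbors' : construct the affinity matrix by computing a graph of nearest neighbors.
'rbf' : construct the affinity matrix by computing a radial basis function (RBF) kernel.
'precomputed' : interpret X as a precomputed affinity matrix.
'precomputed_nearest_neighbors' : interpret X as a sparse graph of precomputed nearest neighbors, and construct the affinity matrix by selecting the n_neighbors nearest neighbors.
callable : use the passed-in function as the affinity. The function takes a data matrix of shape (n_samples, n_features) and returns an affinity matrix of shape (n_samples, n_samples).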
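
As a rough illustration of the options listed above, the following sketch builds the same RBF affinity in three ways: through the built-in 'rbf' option, as a precomputed matrix, and through a callable. The random data, the value gamma=0.5, and the function name `my_affinity` are assumptions made for the example.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.RandomState(0)
X = rng.rand(50, 4)

# Built-in RBF affinity with an explicit kernel coefficient.
emb_rbf = SpectralEmbedding(
    n_components=2, affinity="rbf", gamma=0.5, random_state=0
).fit_transform(X)

# Precomputed affinity: pass the (n_samples, n_samples) matrix instead of X.
A = rbf_kernel(X, gamma=0.5)
emb_pre = SpectralEmbedding(
    n_components=2, affinity="precomputed", random_state=0
).fit_transform(A)


# Callable affinity (hypothetical helper): data matrix in, affinity matrix out.
def my_affinity(data):
    return rbf_kernel(data, gamma=0.5)


emb_call = SpectralEmbedding(
    n_components=2, affinity=my_affinity, random_state=0
).fit_transform(X)

print(emb_rbf.shape, emb_pre.shape, emb_call.shape)
```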

gamma : float, default=None

Kernel coefficient for the rbf kernel. If None, gamma is set to 1/n_features.

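random_state : int, RandomState instance or None, default=None

A pseudo-random number generator used to initialize the lobpcg eigenvector decomposition when eigen_solver == 'amg', and for the K-Means initialization. Pass an int for reproducible results across multiple calls (see Glossary).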

Note

When using eigen_solver == 'amg', it is also necessary to fix the global numpy seed with np.random.seed(int) to get deterministic results. See pyamg/pyamg#139 for further information.
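
A minimal reproducibility check, assuming the default solver (so pyamg is not required) and random demo data: two estimators built with the same integer random_state are fitted on the same input and their embeddings compared. With eigen_solver='amg', the note above suggests the global numpy seed would also need to be fixed.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.RandomState(0)
X = rng.rand(80, 5)

# Same integer seed on both estimators, default eigen_solver.
first = SpectralEmbedding(n_components=2, random_state=42).fit_transform(X)
second = SpectralEmbedding(n_components=2, random_state=42).fit_transform(X)

same_result = np.allclose(first, second)
print(same_result)
```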

eigen_solver : {'arpack', 'lobpcg', 'amg'}, default=None

The eigenvalue decomposition strategy to use. AMG requires pyamg to be installed; it can be faster on very large, sparse problems. If None, 'arpack' is used.

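eigen_tol : float, default="auto"

Stopping criterion for the eigendecomposition of the Laplacian matrix. If eigen_tol="auto", the tolerance passed depends on eigen_solver:

If eigen_solver="arpack", then eigen_tol=0.0;
If eigen_solver="lobpcg" or eigen_solver="amg", then eigen_tol=None, which configures the underlying lobpcg solver to determine the value automatically according to its heuristics. See scipy.sparse.linalg.lobpcg for details.

Note that when using eigen_solver="lobpcg" or eigen_solver="amg", values of tol<1e-5 may lead to convergence issues and should be avoided.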

Added in version 1.2.

n_neighbors : int, default=None

Number of nearest neighbors used to construct the nearest-neighbors graph. If None, n_neighbors is set to max(n_samples/10, 1).
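
To see the default rule in action, the sketch below fits one estimator with n_neighbors=None and one with an explicit value, then reads back the fitted n_neighbors_ attribute. The 100-sample random dataset is an assumption chosen so that n_samples/10 is a whole number.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.RandomState(0)
X = rng.rand(100, 5)

# n_neighbors=None: the value is derived from the number of samples.
default_model = SpectralEmbedding(random_state=0).fit(X)

# Explicit neighborhood size.
explicit_model = SpectralEmbedding(n_neighbors=15, random_state=0).fit(X)

print(default_model.n_neighbors_, explicit_model.n_neighbors_)
```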

n_jobs : int, default=None

The number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

Attributes:

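embedding_ : ndarray of shape (n_samples, n_components)

Spectral embedding of the training matrix.

affinity_matrix_ : ndarray of shape (n_samples, n_samples)

Affinity matrix constructed from samples or precomputed.

n_features_in_ : int

Number of features seen during fit.

Added in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

Added in version 1.0.

n_neighbors_ : int

Number of nearest neighbors effectively used.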
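
A short sketch of reading the fitted attributes, assuming random demo data and the default 'nearest_neighbors' affinity. Only shapes are inspected, because the affinity matrix may be stored as a SciPy sparse matrix in that case.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.RandomState(0)
X = rng.rand(100, 5)

model = SpectralEmbedding(n_components=2, random_state=0).fit(X)

embedding_shape = model.embedding_.shape        # one row per sample
affinity_shape = model.affinity_matrix_.shape   # square, n_samples per side
n_features = model.n_features_in_               # number of columns of X

print(embedding_shape, affinity_shape, n_features)
```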

See also

Isomap : Non-linear dimensionality reduction through Isometric Mapping.

References

A Tutorial on Spectral Clustering, 2007 Ulrike von Luxburg
On Spectral Clustering: Analysis and an algorithm, 2001 Andrew Y. Ng, Michael I. Jordan, Yair Weiss
Normalized cuts and image segmentation, 2000 Jianbo Shi, Jitendra Malik

Examples

>>> from sklearn.datasets import load_digits
>>> from sklearn.manifold import SpectralEmbedding
>>> X, _ = load_digits(return_X_y=True)
>>> X.shape
(1797, 64)
>>> embedding = SpectralEmbedding(n_components=2)
>>> X_transformed = embedding.fit_transform(X[:100])
>>> X_transformed.shape
(100, 2)

fit(X, y=None)

Fit the model from data in X.

Parameters:

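X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training vector, where n_samples is the number of samples and n_features is the number of features.

If affinity is "precomputed", X : {array-like, sparse matrix} of shape (n_samples, n_samples). Interpret X as a precomputed adjacency graph computed from samples.

y : Ignored

Not used, present for API consistency by convention.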

Returns:

self : object

Returns the instance itself.
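
As an illustration of the precomputed case, the sketch below passes a sparse, symmetrized k-nearest-neighbor graph to fit. The random data and the choice of 10 neighbors are assumptions for the example.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.neighbors import kneighbors_graph

rng = np.random.RandomState(0)
X = rng.rand(60, 3)

# Build an adjacency graph from the samples and make it symmetric.
graph = kneighbors_graph(X, n_neighbors=10, include_self=True)
adjacency = 0.5 * (graph + graph.T)

# With affinity="precomputed", fit receives the (n_samples, n_samples) graph.
model = SpectralEmbedding(
    n_components=2, affinity="precomputed", random_state=0
).fit(adjacency)

print(model.embedding_.shape)
```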

fit_transform(X, y=None)

Fit the model from data in X and transform X.

Parameters:

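X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training vector, where n_samples is the number of samples and n_features is the number of features.

If affinity is "precomputed", X : {array-like, sparse matrix} of shape (n_samples, n_samples). Interpret X as a precomputed adjacency graph computed from samples.

y : Ignored

Not used, present for API consistency by convention.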

Returns:

X_new : array-like of shape (n_samples, n_components)

Spectral embedding of the training matrix.
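
A small sketch relating the return value of fit_transform to the fitted embedding_ attribute, assuming random demo data. The estimator only embeds the data it was fitted on, so the result always has one row per training sample.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.RandomState(0)
X = rng.rand(70, 6)

model = SpectralEmbedding(n_components=3, random_state=0)
X_new = model.fit_transform(X)

# The returned array matches what is stored on the fitted estimator.
matches_attribute = np.array_equal(X_new, model.embedding_)
print(X_new.shape, matches_attribute)
```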

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.
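
For instance, the returned dictionary can be inspected as follows (a minimal sketch; the constructor arguments are arbitrary):

```python
from sklearn.manifold import SpectralEmbedding

est = SpectralEmbedding(n_components=3, affinity="rbf")
params = est.get_params()

# Keys are the constructor argument names; values are the current settings.
print(params["n_components"], params["affinity"], params["gamma"])
```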

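set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.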

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.
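
The <component>__<parameter> form can be illustrated with a small Pipeline. The step names "scale" and "embed", the random data, and the parameter values are assumptions made for this sketch; SpectralEmbedding is placed last because it is used here through fit_transform.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline(
    [
        ("scale", StandardScaler()),
        ("embed", SpectralEmbedding(random_state=0)),
    ]
)

# Update the nested estimator through the pipeline.
pipe.set_params(embed__n_components=3, embed__n_neighbors=8)

rng = np.random.RandomState(0)
X = rng.rand(40, 5)
X_embedded = pipe.fit_transform(X)

print(pipe.named_steps["embed"].n_components, X_embedded.shape)
```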

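Gallery examples

Various Agglomerative Clustering on a 2D embedding of digits

Manifold learning on handwritten digits: Locally Linear Embedding, Isomap…

Comparison of Manifold Learning methods

Manifold Learning methods on a Severed Sphere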