Label Propagation learning a complex structure

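An example of label propagation learning a complex internal structure, as a demonstration of "manifold learning". Every point on the outer circle should receive the outer label and every point on the inner circle the inner label (drawn in navy and cyan, respectively, in the plots below). Because each labeled group lies within its own distinct shape, the labels propagate correctly around each circle.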

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

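We generate a dataset made of two concentric circles. Each sample in the dataset is also associated with a label: 0 (belongs to the outer circle), 1 (belongs to the inner circle), or -1 (unknown). Here, all labels except two are marked as unknown, so each circle has exactly one labeled sample.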

import numpy as np

from sklearn.datasets import make_circles

n_samples = 200
X, y = make_circles(n_samples=n_samples, shuffle=False)
outer, inner = 0, 1
labels = np.full(n_samples, -1.0)
labels[0] = outer
labels[-1] = inner
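
As a quick sanity check on the labeling described above, the following sketch rebuilds the same dataset and counts how many samples are labeled and unlabeled. It also compares the two seed labels against the ground truth `y` returned by `make_circles`. The names `n_labeled`, `n_unlabeled`, and `seeds_match_truth` are introduced here for illustration only and are not part of the original example.

```python
import numpy as np
from sklearn.datasets import make_circles

n_samples = 200
X, y = make_circles(n_samples=n_samples, shuffle=False)

# Same partial labeling as in the example: one outer seed, one inner seed.
outer, inner = 0, 1
labels = np.full(n_samples, -1.0)
labels[0] = outer
labels[-1] = inner

# Count labeled and unlabeled samples (illustrative helper variables).
n_labeled = int(np.sum(labels != -1))
n_unlabeled = int(np.sum(labels == -1))
print(f"labeled: {n_labeled}, unlabeled: {n_unlabeled}")

# With shuffle=False the outer circle comes first, so the first sample
# is an outer point and the last sample is an inner point.
seeds_match_truth = bool(y[0] == outer and y[-1] == inner)
print("seeds match ground truth:", seeds_match_truth)
```

The check relies on `shuffle=False`; with shuffling enabled, the first and last samples could belong to either circle.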

Plot the raw data

import matplotlib.pyplot as plt

plt.figure(figsize=(4, 4))
plt.scatter(
    X[labels == outer, 0],
    X[labels == outer, 1],
    color="navy",
    marker="s",
    lw=0,
    label="outer labeled",
    s=10,
)
plt.scatter(
    X[labels == inner, 0],
    X[labels == inner, 1],
    color="c",
    marker="s",
    lw=0,
    label="inner labeled",
    s=10,
)
plt.scatter(
    X[labels == -1, 0],
    X[labels == -1, 1],
    color="darkorange",
    marker=".",
    label="unlabeled",
)
plt.legend(scatterpoints=1, shadow=False, loc="center")
_ = plt.title("Raw data (2 classes=outer and inner)")

The aim of LabelSpreading is to assign a label to every sample whose label is initially unknown.

from sklearn.semi_supervised import LabelSpreading

label_spread = LabelSpreading(kernel="knn", alpha=0.8)
label_spread.fit(X, labels)
LabelSpreading(alpha=0.8, kernel='knn')
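
To see what fitting produces, here is a minimal sketch that refits the same model on the same data and prints a few fitted attributes. The variable name `model` is an illustrative choice. The attributes `classes_`, `label_distributions_`, and `n_iter_` are read from the fitted estimator, and no particular values are assumed for them.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.semi_supervised import LabelSpreading

# Rebuild the partially labeled dataset from the example.
X, _ = make_circles(n_samples=200, shuffle=False)
labels = np.full(200, -1.0)
labels[0], labels[-1] = 0, 1

model = LabelSpreading(kernel="knn", alpha=0.8).fit(X, labels)

# Classes seen among the labeled samples (the -1 marker is excluded).
print("classes:", model.classes_)
# One row per sample, one column per class.
print("label_distributions_ shape:", model.label_distributions_.shape)
# Number of iterations the solver ran.
print("iterations run:", model.n_iter_)
```

Each row of `label_distributions_` holds the per-class scores for one sample, and `transduction_` is the class with the highest score in that row.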


Now we can check which label was assigned to each sample whose label was initially unknown.

output_labels = label_spread.transduction_
output_label_array = np.asarray(output_labels)
outer_numbers = np.where(output_label_array == outer)[0]
inner_numbers = np.where(output_label_array == inner)[0]

plt.figure(figsize=(4, 4))
plt.scatter(
    X[outer_numbers, 0],
    X[outer_numbers, 1],
    color="navy",
    marker="s",
    lw=0,
    s=10,
    label="outer learned",
)
plt.scatter(
    X[inner_numbers, 0],
    X[inner_numbers, 1],
    color="c",
    marker="s",
    lw=0,
    s=10,
    label="inner learned",
)
plt.legend(scatterpoints=1, shadow=False, loc="center")
plt.title("Labels learned with Label Spreading (KNN)")
plt.show()
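
Beyond the visual check, the learned labels can be compared numerically with the ground truth from `make_circles`. The sketch below is one way to do that, assuming the same dataset and model settings as above. The names `y_true`, `unlabeled`, and `accuracy` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.semi_supervised import LabelSpreading

n_samples = 200
X, y_true = make_circles(n_samples=n_samples, shuffle=False)

# Same partial labeling as in the example.
labels = np.full(n_samples, -1.0)
labels[0], labels[-1] = 0, 1

model = LabelSpreading(kernel="knn", alpha=0.8).fit(X, labels)

# Score only the samples whose label was initially unknown.
unlabeled = labels == -1
accuracy = float(np.mean(model.transduction_[unlabeled] == y_true[unlabeled]))
print(f"accuracy on initially unlabeled samples: {accuracy:.3f}")
```

Restricting the score to the initially unlabeled samples avoids counting the two seeds, whose labels were given to the model.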
