Label Propagation learning a complex structure
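An example of label propagation learning a complex internal structure, used to demonstrate "manifold learning". The outer circle should be labeled "red" and the inner circle "blue" (drawn in navy and cyan, respectively, in the plots below). Because each label group lies inside its own distinct shape, the labels propagate correctly around each circle.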
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
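We generate a dataset consisting of two concentric circles. Each sample is associated with a label: 0 (belongs to the outer circle), 1 (belongs to the inner circle), or -1 (unknown). Here, all labels except two are marked as unknown: one sample on the outer circle and one on the inner circle keep their true label.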
import numpy as np
from sklearn.datasets import make_circles
n_samples = 200
X, y = make_circles(n_samples=n_samples, shuffle=False)
outer, inner = 0, 1
labels = np.full(n_samples, -1.0)
labels[0] = outer
labels[-1] = inner
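Using `labels[0]` and `labels[-1]` as the two seeds relies on the sample ordering. As a quick sanity check, assuming the default `factor=0.8` and no noise, the sketch below confirms that with `shuffle=False` the first sample lies on the outer circle and the last on the inner one. The names `X_check`, `y_check` and `radii` are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import make_circles

# Same call as in the example; shuffle=False keeps the generated order.
X_check, y_check = make_circles(n_samples=200, shuffle=False)

# Distance of every sample from the origin (the circles are centered there).
radii = np.linalg.norm(X_check, axis=1)

# With the default factor=0.8, the outer radius is 1.0 and the inner one 0.8.
first_is_outer = y_check[0] == 0 and np.isclose(radii[0], 1.0)
last_is_inner = y_check[-1] == 1 and np.isclose(radii[-1], 0.8)

print("first sample on outer circle:", bool(first_is_outer))
print("last sample on inner circle:", bool(last_is_inner))
```

If `shuffle=True` were used instead, the seed indices would have to be looked up from the ground-truth labels rather than hard-coded.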
Plot the raw data
import matplotlib.pyplot as plt
plt.figure(figsize=(4, 4))
plt.scatter(
    X[labels == outer, 0],
    X[labels == outer, 1],
    color="navy",
    marker="s",
    lw=0,
    label="outer labeled",
    s=10,
)
plt.scatter(
    X[labels == inner, 0],
    X[labels == inner, 1],
    color="c",
    marker="s",
    lw=0,
    label="inner labeled",
    s=10,
)
plt.scatter(
    X[labels == -1, 0],
    X[labels == -1, 1],
    color="darkorange",
    marker=".",
    label="unlabeled",
)
plt.legend(scatterpoints=1, shadow=False, loc="center")
_ = plt.title("Raw data (2 classes=outer and inner)")
The aim of LabelSpreading is to assign a label to every sample whose label is initially unknown.
from sklearn.semi_supervised import LabelSpreading
label_spread = LabelSpreading(kernel="knn", alpha=0.8)
label_spread.fit(X, labels)
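To give an intuition for why two seeds are enough, here is a simplified sketch of graph-based label spreading written directly in NumPy. It is not scikit-learn's implementation: it assumes a symmetric 6-nearest-neighbor graph (self excluded), a symmetrically normalized affinity matrix `S`, and the closed-form fixed point of the update `F = alpha * S @ F + (1 - alpha) * Y`. All variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.neighbors import kneighbors_graph

X_demo, y_true = make_circles(n_samples=200, shuffle=False)
n = X_demo.shape[0]

# One-hot seed matrix: a single labeled sample per circle, zeros elsewhere.
Y = np.zeros((n, 2))
Y[0, 0] = 1.0   # first sample: outer circle (class 0)
Y[-1, 1] = 1.0  # last sample: inner circle (class 1)

# Symmetric kNN adjacency; n_neighbors=6 is an assumption of this sketch.
W = kneighbors_graph(X_demo, n_neighbors=6, mode="connectivity").toarray()
W = np.maximum(W, W.T)

# Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
degree = W.sum(axis=1)
S = W / np.sqrt(np.outer(degree, degree))

# Fixed point of F = alpha * S @ F + (1 - alpha) * Y, solved in closed form.
alpha = 0.8
F = (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, Y)

# Each sample takes the class with the largest score.
predicted = F.argmax(axis=1)
print("agreement with ground truth:", (predicted == y_true).mean())
```

Because the neighbor graph only links points that are close along a circle, the label information travels around each ring rather than jumping across the gap between them.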
Now we can check which label was assigned to each sample whose label was initially unknown.
output_labels = label_spread.transduction_
output_label_array = np.asarray(output_labels)
outer_numbers = np.where(output_label_array == outer)[0]
inner_numbers = np.where(output_label_array == inner)[0]
plt.figure(figsize=(4, 4))
plt.scatter(
    X[outer_numbers, 0],
    X[outer_numbers, 1],
    color="navy",
    marker="s",
    lw=0,
    s=10,
    label="outer learned",
)
plt.scatter(
    X[inner_numbers, 0],
    X[inner_numbers, 1],
    color="c",
    marker="s",
    lw=0,
    s=10,
    label="inner learned",
)
plt.legend(scatterpoints=1, shadow=False, loc="center")
plt.title("Labels learned with Label Spreading (KNN)")
plt.show()
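Since `make_circles` also returns the true labels, the visual check can be complemented with a numeric one. The sketch below refits the same model on a fresh copy of the data and compares `transduction_` with the ground truth. It also shows how the hard labels relate to the per-class scores in `label_distributions_`. The names `X_eval`, `seed_labels` and `model` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.semi_supervised import LabelSpreading

X_eval, y_true = make_circles(n_samples=200, shuffle=False)

# Same seeding as in the example: one known label per circle.
seed_labels = np.full(200, -1.0)
seed_labels[0] = 0
seed_labels[-1] = 1

model = LabelSpreading(kernel="knn", alpha=0.8).fit(X_eval, seed_labels)

# Fraction of samples whose inferred label matches the true circle.
accuracy = np.mean(model.transduction_ == y_true)
print(f"accuracy against ground truth: {accuracy:.3f}")

# Hard labels are the classes with the highest score in each row.
hard_labels = model.classes_[np.argmax(model.label_distributions_, axis=1)]
print("matches transduction_:", np.array_equal(hard_labels, model.transduction_))
```

The rows of `label_distributions_` can also be inspected directly when a confidence score per sample is more useful than a hard label.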