Keras Applications

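Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.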

Weights are downloaded automatically when a model is instantiated. They are stored in ~/.keras/models/.
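
To check which weight files have already been downloaded, the cache directory can be inspected directly. The following is a minimal sketch using only the standard library. The helper name `list_cached_weights` is hypothetical and not part of Keras, and the sketch assumes the default cache location mentioned above.

```python
from pathlib import Path


def list_cached_weights(cache_dir=None):
    """Return (file name, size in MB) pairs for files in the weights cache.

    `cache_dir` defaults to ~/.keras/models/, the location given above.
    This helper is illustrative and not part of the Keras API.
    """
    cache = Path(cache_dir) if cache_dir is not None else Path.home() / ".keras" / "models"
    if not cache.is_dir():
        # Nothing has been downloaded yet (or the cache lives elsewhere).
        return []
    return sorted(
        (p.name, round(p.stat().st_size / 1e6, 2))
        for p in cache.iterdir()
        if p.is_file()
    )


if __name__ == "__main__":
    for name, size_mb in list_cached_weights():
        print(f"{name}: {size_mb} MB")
```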

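Upon instantiation, the models are built according to the image data format set in your Keras configuration file at ~/.keras/keras.json. For instance, if you have set image_data_format=channels_last, then any model loaded from this repository will be built according to the data format convention "Height-Width-Depth".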
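
The effect of this setting on tensor layout can be illustrated with a short sketch. It reads the `image_data_format` key from a Keras configuration file and derives the per-image shape for each convention. The helper names `read_image_data_format` and `image_shape` are hypothetical, and falling back to `channels_last` when the file or key is missing is a choice made for this sketch.

```python
import json
from pathlib import Path


def read_image_data_format(config_path=None):
    """Read `image_data_format` from a Keras config file (hypothetical helper)."""
    path = Path(config_path) if config_path is not None else Path.home() / ".keras" / "keras.json"
    try:
        config = json.loads(path.read_text())
    except (OSError, ValueError):
        # Missing or unreadable file: fall back to channels_last.
        return "channels_last"
    return config.get("image_data_format", "channels_last")


def image_shape(height, width, channels, data_format):
    """Return the per-image shape for the given data format."""
    if data_format == "channels_last":
        return (height, width, channels)   # Height-Width-Depth
    if data_format == "channels_first":
        return (channels, height, width)   # Depth-Height-Width
    raise ValueError(f"Unknown image data format: {data_format!r}")


if __name__ == "__main__":
    fmt = read_image_data_format()
    print(fmt, image_shape(224, 224, 3, fmt))
```

Within a running program, `keras.config.image_data_format()` reports the setting that is actually in effect.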

Available models

Model	Size (MB)	Top-1 Accuracy	Top-5 Accuracy	Parameters	Depth	Time (ms) per inference step (CPU)	Time (ms) per inference step (GPU)
Xception	88	79.0%	94.5%	22.9M	81	109.4	8.1
VGG16	528	71.3%	90.1%	138.4M	16	69.5	4.2
VGG19	549	71.3%	90.0%	143.7M	19	84.8	4.4
ResNet50	98	74.9%	92.1%	25.6M	107	58.2	4.6
ResNet50V2	98	76.0%	93.0%	25.6M	103	45.6	4.4
ResNet101	171	76.4%	92.8%	44.7M	209	89.6	5.2
ResNet101V2	171	77.2%	93.8%	44.7M	205	72.7	5.4
ResNet152	232	76.6%	93.1%	60.4M	311	127.4	6.5
ResNet152V2	232	78.0%	94.2%	60.4M	307	107.5	6.6
InceptionV3	92	77.9%	93.7%	23.9M	189	42.2	6.9
InceptionResNetV2	215	80.3%	95.3%	55.9M	449	130.2	10.0
MobileNet	16	70.4%	89.5%	4.3M	55	22.6	3.4
MobileNetV2	14	71.3%	90.1%	3.5M	105	25.9	3.8
DenseNet121	33	75.0%	92.3%	8.1M	242	77.1	5.4
DenseNet169	57	76.2%	93.2%	14.3M	338	96.4	6.3
DenseNet201	80	77.3%	93.6%	20.2M	402	127.2	6.7
NASNetMobile	23	74.4%	91.9%	5.3M	389	27.0	6.7
NASNetLarge	343	82.5%	96.0%	88.9M	533	344.5	20.0
EfficientNetB0	29	77.1%	93.3%	5.3M	132	46.0	4.9
EfficientNetB1	31	79.1%	94.4%	7.9M	186	60.2	5.6
EfficientNetB2	36	80.1%	94.9%	9.2M	186	80.8	6.5
EfficientNetB3	48	81.6%	95.7%	12.3M	210	140.0	8.8
EfficientNetB4	75	82.9%	96.4%	19.5M	258	308.3	15.1
EfficientNetB5	118	83.6%	96.7%	30.6M	312	579.2	25.3
EfficientNetB6	166	84.0%	96.8%	43.3M	360	958.1	40.4
EfficientNetB7	256	84.3%	97.0%	66.7M	438	1578.9	61.6
EfficientNetV2B0	29	78.7%	94.3%	7.2M	-	-	-
EfficientNetV2B1	34	79.8%	95.0%	8.2M	-	-	-
EfficientNetV2B2	42	80.5%	95.1%	10.2M	-	-	-
EfficientNetV2B3	59	82.0%	95.8%	14.5M	-	-	-
EfficientNetV2S	88	83.9%	96.7%	21.6M	-	-	-
EfficientNetV2M	220	85.3%	97.4%	54.4M	-	-	-
EfficientNetV2L	479	85.7%	97.5%	119.0M	-	-	-
ConvNeXtTiny	109.42	81.3%	-	28.6M	-	-	-
ConvNeXtSmall	192.29	82.3%	-	50.2M	-	-	-
ConvNeXtBase	338.58	85.3%	-	88.5M	-	-	-
ConvNeXtLarge	755.07	86.3%	-	197.7M	-	-	-
ConvNeXtXLarge	1310	86.7%	-	350.1M	-	-	-

The top-1 and top-5 accuracy refer to the model's performance on the ImageNet validation dataset.
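
As a reminder of what these metrics measure: a prediction counts as a top-k hit when the true class is among the k highest-scoring classes. The sketch below computes this for a toy batch in plain Python. The function name `top_k_accuracy` and the scores are made up for illustration and are unrelated to the numbers in the table.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k top-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


if __name__ == "__main__":
    # Three samples, three classes; scores and labels are invented.
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print("top-1:", top_k_accuracy(scores, labels, k=1))
    print("top-2:", top_k_accuracy(scores, labels, k=2))
```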

Depth refers to the topological depth of the network. This includes activation layers, batch normalization layers, etc.

Time per inference step is the average of 30 batches and 10 repetitions, measured with the following setup:

- CPU: AMD EPYC Processor (with IBPB) (92 core)
- RAM: 1.7T
- GPU: Tesla A100
- Batch size: 32
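
A comparable measurement can be approximated with a small timing loop. The sketch below is one plausible reading of the methodology (10 repetitions of 30 consecutive steps, averaged over all steps), not the script used to produce the table. The function name `time_per_step_ms` and the warm-up call are assumptions of this sketch.

```python
import time


def time_per_step_ms(step_fn, batches=30, repeats=10):
    """Average wall-clock time of `step_fn` in milliseconds."""
    step_fn()  # warm-up call, excluded from the measurement (an assumption)
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(batches):
            step_fn()
        total += time.perf_counter() - start
    return 1000.0 * total / (repeats * batches)


if __name__ == "__main__":
    # Stand-in workload; in practice this would be something like
    # `lambda: model.predict(batch, verbose=0)` with a batch of 32 images.
    print(f"{time_per_step_ms(lambda: sum(range(10_000))):.3f} ms per step")
```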

Depth counts the number of layers with parameters.

Usage examples for image classification models

Classify ImageNet classes with ResNet50

import keras
from keras.applications.resnet50 import ResNet50
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [('n02504013', 'Indian_elephant', 0.82658225), ('n01871265', 'tusker', 0.1122357), ('n02504458', 'African_elephant', 0.061040461)]
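
To make the shape of that result concrete, the following sketch mimics the decoding step on a toy batch in plain Python. It is not the implementation of `decode_predictions`. The function `decode_top_k` and the three-entry `class_index` are hypothetical stand-ins for the 1000-class ImageNet index.

```python
def decode_top_k(preds, class_index, top=3):
    """Turn a batch of score vectors into lists of (class_id, name, score).

    `class_index` maps a position in the score vector to (class_id, name).
    """
    results = []
    for scores in preds:
        # Indices of the `top` highest scores, best first.
        best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top]
        results.append([(*class_index[i], scores[i]) for i in best])
    return results


if __name__ == "__main__":
    class_index = {
        0: ("n02504013", "Indian_elephant"),
        1: ("n01871265", "tusker"),
        2: ("n02504458", "African_elephant"),
    }
    preds = [[0.83, 0.11, 0.06]]  # one sample, three made-up scores
    print(decode_top_k(preds, class_index, top=2)[0])
```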

Extract features with VGG16

import keras
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
import numpy as np

model = VGG16(weights='imagenet', include_top=False)

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

features = model.predict(x)

Extract features from an arbitrary intermediate layer with VGG19

import keras
from keras.applications.vgg19 import VGG19
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np

base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

block4_pool_features = model.predict(x)
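
The shapes of the extracted feature maps follow from simple arithmetic, since each VGG block ends in a 2x2 max-pooling layer with stride 2 that halves the height and width. The sketch below works this out for a 224x224 input. The helper `vgg_pool_output_shape` is hypothetical, the filter counts are those of the standard VGG16/VGG19 architecture, and `channels_last` is assumed.

```python
# Number of convolution filters in each of the five VGG blocks.
VGG_BLOCK_FILTERS = (64, 128, 256, 512, 512)


def vgg_pool_output_shape(block, height=224, width=224, batch=1):
    """Shape of the `block{block}_pool` output for a channels_last input."""
    if not 1 <= block <= len(VGG_BLOCK_FILTERS):
        raise ValueError("block must be between 1 and 5")
    # Each block's pooling layer halves the spatial size (floor division).
    factor = 2 ** block
    return (batch, height // factor, width // factor, VGG_BLOCK_FILTERS[block - 1])


if __name__ == "__main__":
    print("block4_pool:", vgg_pool_output_shape(4))
    print("block5_pool:", vgg_pool_output_shape(5))
```

For the examples above, this gives (1, 14, 14, 512) for `block4_pool_features` and (1, 7, 7, 512) for the VGG16 `features`, whose last layer with `include_top=False` is `block5_pool`.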

Fine-tune InceptionV3 on a new set of classes

from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

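# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs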
model.fit(...)

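# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.

# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss='categorical_crossentropy')

# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers)
model.fit(...)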
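
The freeze-then-unfreeze pattern used above only depends on each layer exposing a `trainable` attribute. The sketch below isolates that step in a hypothetical helper, `freeze_below`, and demonstrates it on `SimpleNamespace` stand-ins rather than real Keras layers so that it runs without downloading any weights.

```python
from types import SimpleNamespace


def freeze_below(layers, first_trainable):
    """Freeze layers[:first_trainable] and unfreeze the rest.

    Returns the number of layers left trainable. As in the example above,
    the model must be recompiled afterwards for the change to take effect.
    """
    for index, layer in enumerate(layers):
        layer.trainable = index >= first_trainable
    return sum(1 for layer in layers if layer.trainable)


if __name__ == "__main__":
    # Stand-ins for model.layers; with a real model one would call
    # freeze_below(model.layers, 249).
    layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(6)]
    print(freeze_below(layers, 4), [layer.trainable for layer in layers])
```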

Build InceptionV3 over a custom input tensor

from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input

# this could also be the output of a different Keras model or layer
input_tensor = Input(shape=(224, 224, 3))

model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)