
MobileNet, MobileNetV2, and MobileNetV3


MobileNet function

keras.applications.MobileNet(
    input_shape=None,
    alpha=1.0,
    depth_multiplier=1,
    dropout=0.001,
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    pooling=None,
    classes=1000,
    classifier_activation="softmax",
    name=None,
)

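Instantiates the MobileNet architecture.

Reference

  • MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note: each Keras Application expects a specific kind of input preprocessing. For MobileNet, call keras.applications.mobilenet.preprocess_input on your inputs before passing them to the model. mobilenet.preprocess_input will scale input pixels between -1 and 1.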
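As a rough illustration of the scaling described in the note above, the following NumPy sketch maps pixel values from [0, 255] to [-1, 1]. The formula x / 127.5 - 1 and the helper name scale_pixels are assumptions made for illustration only; in real code, call keras.applications.mobilenet.preprocess_input itself.

```python
import numpy as np


def scale_pixels(pixels):
    """Hypothetical helper: map pixel values from [0, 255] to [-1, 1]."""
    # Assumed formula for the documented -1..1 scaling; it is not copied
    # from the Keras source.
    return np.asarray(pixels, dtype="float32") / 127.5 - 1.0


# A tiny fake "image" holding the darkest, middle, and brightest pixel values.
pixels = np.array([0.0, 127.5, 255.0])
scaled = scale_pixels(pixels)
print(scaled)
```

The point of the sketch is only the value range: whatever produces the input, the model should receive floats centered on 0 rather than raw 0 to 255 pixel values.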

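Arguments

  • input_shape: Optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with the "channels_last" data format or (3, 224, 224) with the "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. Defaults to None. input_shape will be ignored if input_tensor is provided.
  • alpha: Controls the width of the network. This is known as the width multiplier in the MobileNet paper.
    • If alpha < 1.0, proportionally decreases the number of filters in each layer.
    • If alpha > 1.0, proportionally increases the number of filters in each layer.
    • If alpha == 1, default number of filters from the paper are used at each layer. Defaults to 1.0.
  • depth_multiplier: Depth multiplier for depthwise convolution. This is called the resolution multiplier in the MobileNet paper. Defaults to 1.
  • dropout: Dropout rate. Defaults to 0.001.
  • include_top: Boolean, whether to include the fully-connected layer at the top of the network. Defaults to True.
  • weights: One of None (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".
  • input_tensor: Optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model. input_tensor is useful for sharing inputs between multiple different networks. Defaults to None.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None (default) means that the output of the model will be the 4D tensor output of the last convolutional block.
    • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • max means that global max pooling will be applied.
  • classes: Optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified. Defaults to 1000.
  • classifier_activation: A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=True. Set classifier_activation=None to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be None or "softmax".
  • name: String, the name of the model.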

Returns

A model instance.



MobileNetV2 function

keras.applications.MobileNetV2(
    input_shape=None,
    alpha=1.0,
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    pooling=None,
    classes=1000,
    classifier_activation="softmax",
    name=None,
)

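Instantiates the MobileNetV2 architecture.

MobileNetV2 is very similar to the original MobileNet, except that it uses inverted residual blocks with bottlenecking features. It has a drastically lower parameter count than the original MobileNet. MobileNets support any input size greater than 32 x 32, with larger image sizes offering better performance.

Reference

  • MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note: each Keras Application expects a specific kind of input preprocessing. For MobileNetV2, call keras.applications.mobilenet_v2.preprocess_input on your inputs before passing them to the model. mobilenet_v2.preprocess_input will scale input pixels between -1 and 1.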

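Arguments

  • input_shape: Optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with the "channels_last" data format or (3, 224, 224) with the "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. Defaults to None. input_shape will be ignored if input_tensor is provided.
  • alpha: Controls the width of the network. This is known as the width multiplier in the MobileNet paper.
    • If alpha < 1.0, proportionally decreases the number of filters in each layer.
    • If alpha > 1.0, proportionally increases the number of filters in each layer.
    • If alpha == 1, default number of filters from the paper are used at each layer. Defaults to 1.0.
  • include_top: Boolean, whether to include the fully-connected layer at the top of the network. Defaults to True.
  • weights: One of None (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".
  • input_tensor: Optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model. input_tensor is useful for sharing inputs between multiple different networks. Defaults to None.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None (default) means that the output of the model will be the 4D tensor output of the last convolutional block.
    • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • max means that global max pooling will be applied.
  • classes: Optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified. Defaults to 1000.
  • classifier_activation: A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=True. Set classifier_activation=None to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be None or "softmax".
  • name: String, the name of the model.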

Returns

A model instance.



MobileNetV3Small function

keras.applications.MobileNetV3Small(
    input_shape=None,
    alpha=1.0,
    minimalistic=False,
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    classes=1000,
    pooling=None,
    dropout_rate=0.2,
    classifier_activation="softmax",
    include_preprocessing=True,
    name="MobileNetV3Small",
)

Instantiates the MobileNetV3Small architecture.

Reference

  • Searching for MobileNetV3 (ICCV 2019)

The following table describes the performance of MobileNets v3:

MACs stands for Multiply Adds

Classification Checkpoint                 MACs (M)  Parameters (M)  Top-1 Accuracy (%)  Pixel 1 CPU (ms)
mobilenet_v3_large_1.0_224                217       5.4             75.6                51.2
mobilenet_v3_large_0.75_224               155       4.0             73.3                39.8
mobilenet_v3_large_minimalistic_1.0_224   209       3.9             72.3                44.1
mobilenet_v3_small_1.0_224                66        2.9             68.1                15.8
mobilenet_v3_small_0.75_224               44        2.4             65.4                12.8
mobilenet_v3_small_minimalistic_1.0_224   65        2.0             61.9                12.2
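One way to read the table is as an accuracy versus latency trade-off. The sketch below copies the Top-1 accuracy and Pixel 1 CPU latency columns from the table and picks the most accurate checkpoint that fits a latency budget. The helper best_under_budget and the budgets used are hypothetical and only illustrate the comparison; they are not part of Keras.

```python
# (checkpoint, top-1 accuracy in %, Pixel 1 CPU latency in ms), copied from
# the performance table.
CHECKPOINTS = [
    ("mobilenet_v3_large_1.0_224", 75.6, 51.2),
    ("mobilenet_v3_large_0.75_224", 73.3, 39.8),
    ("mobilenet_v3_large_minimalistic_1.0_224", 72.3, 44.1),
    ("mobilenet_v3_small_1.0_224", 68.1, 15.8),
    ("mobilenet_v3_small_0.75_224", 65.4, 12.8),
    ("mobilenet_v3_small_minimalistic_1.0_224", 61.9, 12.2),
]


def best_under_budget(budget_ms):
    """Return the most accurate checkpoint whose latency fits the budget."""
    candidates = [row for row in CHECKPOINTS if row[2] <= budget_ms]
    if not candidates:
        return None  # nothing is fast enough
    return max(candidates, key=lambda row: row[1])[0]


choice_20ms = best_under_budget(20.0)
choice_45ms = best_under_budget(45.0)
print(choice_20ms, choice_45ms)
```

The latencies were measured on a Pixel 1 CPU, so they are best treated as a relative ranking rather than as numbers to expect on other hardware.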

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note: each Keras Application expects a specific kind of input preprocessing. For MobileNetV3, by default input preprocessing is included as a part of the model (as a Rescaling layer), and thus keras.applications.mobilenet_v3.preprocess_input is actually a pass-through function. In this use case, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. At the same time, preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to False. With preprocessing disabled, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.
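The two input conventions described in the note can be sketched as a small NumPy helper. The function name prepare_for_mobilenet_v3 is hypothetical, and the x / 127.5 - 1 formula is an assumed manual equivalent of the disabled Rescaling layer, used here only to show the expected value ranges.

```python
import numpy as np


def prepare_for_mobilenet_v3(images_uint8, include_preprocessing=True):
    """Hypothetical helper: cast uint8 images to the float range the model expects."""
    x = np.asarray(images_uint8).astype("float32")
    if include_preprocessing:
        # The model rescales internally, so keep values in [0, 255].
        return x
    # Assumed manual equivalent of the Rescaling layer: map to [-1, 1].
    return x / 127.5 - 1.0


images = np.array([[0, 255]], dtype="uint8")  # darkest and brightest pixel
raw = prepare_for_mobilenet_v3(images, include_preprocessing=True)
manual = prepare_for_mobilenet_v3(images, include_preprocessing=False)
print(raw, manual)
```

Whichever convention is chosen, the flag passed to the helper has to match the include_preprocessing value the model was built with; otherwise the inputs end up scaled twice or not at all.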

Arguments

  • input_shape: Optional shape tuple, to be specified if you would like to use a model with an input image resolution that is not (224, 224, 3). It should have exactly 3 input channels. You can also omit this option if you would like to infer input_shape from an input_tensor. If you provide both input_tensor and input_shape, input_shape is used if the shapes match; if they do not match, an error is raised. E.g. (160, 160, 3) would be one valid value.
  • alpha: controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for consistency with MobileNetV1 in Keras.
    • If alpha < 1.0, proportionally decreases the number of filters in each layer.
    • If alpha > 1.0, proportionally increases the number of filters in each layer.
    • If alpha == 1, default number of filters from the paper are used at each layer.
  • minimalistic: In addition to large and small models, this module also contains so-called minimalistic models. These models have the same per-layer dimensions as MobileNetV3, but they don't utilize any of the advanced blocks (squeeze-and-excite units, hard-swish, and 5x5 convolutions). While these models are less efficient on CPU, they are much more performant on GPU/DSP.
  • include_top: Boolean, whether to include the fully-connected layer at the top of the network. Defaults to True.
  • weights: String, one of None (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.
  • input_tensor: Optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • pooling: String, optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • max means that global max pooling will be applied.
  • classes: Integer, optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.
  • dropout_rate: fraction of the input units to drop on the last layer.
  • classifier_activation: A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=True. Set classifier_activation=None to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be None or "softmax".
  • include_preprocessing: Boolean, whether to include the preprocessing layer (Rescaling) at the bottom of the network. Defaults to True.
  • name: String, the name of the model.

Call arguments

  • inputs: A floating point numpy.array or backend-native tensor, 4D with 3 color channels, with values in the range [0, 255] if include_preprocessing is True and in the range [-1, 1] otherwise.

Returns

A model instance.



MobileNetV3Large function

keras.applications.MobileNetV3Large(
    input_shape=None,
    alpha=1.0,
    minimalistic=False,
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    classes=1000,
    pooling=None,
    dropout_rate=0.2,
    classifier_activation="softmax",
    include_preprocessing=True,
    name="MobileNetV3Large",
)

Instantiates the MobileNetV3Large architecture.

Reference

  • Searching for MobileNetV3 (ICCV 2019)

The following table describes the performance of MobileNets v3:

MACs stands for Multiply Adds

Classification Checkpoint                 MACs (M)  Parameters (M)  Top-1 Accuracy (%)  Pixel 1 CPU (ms)
mobilenet_v3_large_1.0_224                217       5.4             75.6                51.2
mobilenet_v3_large_0.75_224               155       4.0             73.3                39.8
mobilenet_v3_large_minimalistic_1.0_224   209       3.9             72.3                44.1
mobilenet_v3_small_1.0_224                66        2.9             68.1                15.8
mobilenet_v3_small_0.75_224               44        2.4             65.4                12.8
mobilenet_v3_small_minimalistic_1.0_224   65        2.0             61.9                12.2

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note: each Keras Application expects a specific kind of input preprocessing. For MobileNetV3, by default input preprocessing is included as a part of the model (as a Rescaling layer), and thus keras.applications.mobilenet_v3.preprocess_input is actually a pass-through function. In this use case, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. At the same time, preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to False. With preprocessing disabled, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

Arguments

  • input_shape: Optional shape tuple, to be specified if you would like to use a model with an input image resolution that is not (224, 224, 3). It should have exactly 3 input channels. You can also omit this option if you would like to infer input_shape from an input_tensor. If you provide both input_tensor and input_shape, input_shape is used if the shapes match; if they do not match, an error is raised. E.g. (160, 160, 3) would be one valid value.
  • alpha: controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for consistency with MobileNetV1 in Keras.
    • If alpha < 1.0, proportionally decreases the number of filters in each layer.
    • If alpha > 1.0, proportionally increases the number of filters in each layer.
    • If alpha == 1, default number of filters from the paper are used at each layer.
  • minimalistic: In addition to large and small models, this module also contains so-called minimalistic models. These models have the same per-layer dimensions as MobileNetV3, but they don't utilize any of the advanced blocks (squeeze-and-excite units, hard-swish, and 5x5 convolutions). While these models are less efficient on CPU, they are much more performant on GPU/DSP.
  • include_top: Boolean, whether to include the fully-connected layer at the top of the network. Defaults to True.
  • weights: String, one of None (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.
  • input_tensor: Optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • pooling: String, optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • max means that global max pooling will be applied.
  • classes: Integer, optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.
  • dropout_rate: fraction of the input units to drop on the last layer.
  • classifier_activation: A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=True. Set classifier_activation=None to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be None or "softmax".
  • include_preprocessing: Boolean, whether to include the preprocessing layer (Rescaling) at the bottom of the network. Defaults to True.
  • name: String, the name of the model.

Call arguments

  • inputs: A floating point numpy.array or backend-native tensor, 4D with 3 color channels, with values in the range [0, 255] if include_preprocessing is True and in the range [-1, 1] otherwise.

Returns

A model instance.