Image operations

affine_transform function

keras.ops.image.affine_transform(
    images,
    transform,
    interpolation="bilinear",
    fill_mode="constant",
    fill_value=0,
    data_format=None,
)

Applies the given transform(s) to the image(s).

Arguments

  • images: Input image or batch of images. Must be 3D or 4D.
  • transform: Projective transform matrix/matrices. A vector of length 8 or tensor of size N x 8. If one row of transform is [a0, a1, a2, b0, b1, b2, c0, c1], then it maps the output point (x, y) to a transformed input point (x', y') = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k), where k = c0 x + c1 y + 1. The transform is inverted compared to the transform mapping input points to output points. Note that gradients are not backpropagated into transformation parameters. Note that c0 and c1 are only effective when using TensorFlow backend and will be considered as 0 when using other backends.
  • interpolation: Interpolation method. Available methods are "nearest" and "bilinear". Defaults to "bilinear".
  • fill_mode: Points outside the boundaries of the input are filled according to the given mode. Available methods are "constant", "nearest", "wrap" and "reflect". Defaults to "constant".
    • "reflect": (d c b a | a b c d | d c b a) The input is extended by reflecting about the edge of the last pixel.
    • "constant": (k k k k | a b c d | k k k k) The input is extended by filling all values beyond the edge with the same constant value k specified by fill_value.
    • "wrap": (a b c d | a b c d | a b c d) The input is extended by wrapping around to the opposite edge.
    • "nearest": (a a a a | a b c d | d d d d) The input is extended by the nearest pixel.
  • fill_value: Value used for points outside the boundaries of the input if fill_mode="constant". Defaults to 0.
  • data_format: A string specifying the data format of the input tensor. It can be either "channels_last" or "channels_first". "channels_last" corresponds to inputs with shape (batch, height, width, channels), while "channels_first" corresponds to inputs with shape (batch, channels, height, width). If not specified, the value will default to keras.config.image_data_format.
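
The mapping described for transform can be checked with a few lines of NumPy. The sketch below is illustrative only: the helper names map_output_to_input and forward_to_transform are hypothetical and not part of Keras. It shows how one row maps an output point to an input point, and how a forward (input to output) 3 x 3 matrix could be inverted to obtain such a row.

```python
import numpy as np


def map_output_to_input(row, x, y):
    """Apply one 8-element transform row to the output point (x, y)."""
    a0, a1, a2, b0, b1, b2, c0, c1 = row
    k = c0 * x + c1 * y + 1.0
    return (a0 * x + a1 * y + a2) / k, (b0 * x + b1 * y + b2) / k


def forward_to_transform(forward):
    """Turn a forward (input -> output) 3x3 matrix into an 8-element row.

    The row has to describe the inverse mapping, so the matrix is inverted
    and scaled so that its bottom-right entry equals 1.
    """
    inverse = np.linalg.inv(np.asarray(forward, dtype="float64"))
    inverse = inverse / inverse[2, 2]
    return inverse.reshape(-1)[:8]


# The "translation" row from the examples: output (0, 0) reads input (-20, -16).
translation = [1, 0, -20, 0, 1, -16, 0, 0]
print(map_output_to_input(translation, 0, 0))  # (-20.0, -16.0)

# A forward shift of +20 in x and +16 in y yields that same row.
forward = [[1, 0, 20], [0, 1, 16], [0, 0, 1]]
print(forward_to_transform(forward))
```

With the "zoom" row [1.5, 0, -20, 0, 1.5, -16, 0, 0], the output point (40, 32) maps to itself, so the scaling is centered on the middle of a 64 x 80 image.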

Returns

The transformed image or batch of images.

Examples

>>> x = np.random.random((2, 64, 80, 3)) # batch of 2 RGB images
>>> transform = np.array(
...     [
...         [1.5, 0, -20, 0, 1.5, -16, 0, 0],  # zoom
...         [1, 0, -20, 0, 1, -16, 0, 0],  # translation
...     ]
... )
>>> y = keras.ops.image.affine_transform(x, transform)
>>> y.shape
(2, 64, 80, 3)
>>> x = np.random.random((64, 80, 3)) # single RGB image
>>> transform = np.array([1.0, 0.5, -20, 0.5, 1.0, -16, 0, 0])  # shear
>>> y = keras.ops.image.affine_transform(x, transform)
>>> y.shape
(64, 80, 3)
>>> x = np.random.random((2, 3, 64, 80)) # batch of 2 RGB images
>>> transform = np.array(
...     [
...         [1.5, 0, -20, 0, 1.5, -16, 0, 0],  # zoom
...         [1, 0, -20, 0, 1, -16, 0, 0],  # translation
...     ]
... )
>>> y = keras.ops.image.affine_transform(x, transform,
...     data_format="channels_first")
>>> y.shape
(2, 3, 64, 80)

crop_images function

keras.ops.image.crop_images(
    images,
    top_cropping=None,
    left_cropping=None,
    bottom_cropping=None,
    right_cropping=None,
    target_height=None,
    target_width=None,
    data_format=None,
)

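Crops images to a specified height and width.

Arguments

  • images: Input image or batch of images. Must be 3D or 4D.
  • top_cropping: Number of rows to crop from the top.
  • left_cropping: Number of columns to crop from the left.
  • bottom_cropping: Number of rows to crop from the bottom.
  • right_cropping: Number of columns to crop from the right.
  • target_height: Height of the output images.
  • target_width: Width of the output images.
  • data_format: A string specifying the data format of the input tensor. It can be either "channels_last" or "channels_first". "channels_last" corresponds to inputs with shape (batch, height, width, channels), while "channels_first" corresponds to inputs with shape (batch, channels, height, width). If not specified, the value will default to keras.config.image_data_format.

Returns

Cropped image or batch of images.

Examples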

>>> images = np.reshape(np.arange(1, 28, dtype="float32"), [3, 3, 3])
>>> images[:,:,0] # print the first channel of the images
array([[ 1.,  4.,  7.],
       [10., 13., 16.],
       [19., 22., 25.]], dtype=float32)
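>>> cropped_images = keras.ops.image.crop_images(
...     images, top_cropping=0, left_cropping=0, target_height=2, target_width=2
... )
>>> cropped_images[:,:,0] # print the first channel of the cropped images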
array([[ 1.,  4.],
       [10., 13.]], dtype=float32)
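
For a channels-last image, the relationship between the cropping offsets and the target size can be sketched with plain NumPy slicing. This is only an illustration of the arithmetic, not the Keras implementation; crop_like is a hypothetical helper, and the bottom and right cropping follow implicitly as height - top - target_height and width - left - target_width.

```python
import numpy as np


def crop_like(images, top, left, target_height, target_width):
    """Crop a channels-last (height, width, channels) array by slicing."""
    return images[top : top + target_height, left : left + target_width, :]


images = np.reshape(np.arange(1, 28, dtype="float32"), [3, 3, 3])
cropped = crop_like(images, top=0, left=0, target_height=2, target_width=2)
print(cropped.shape)  # (2, 2, 3)
print(cropped[:, :, 0])
```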

extract_patches function

keras.ops.image.extract_patches(
    images, size, strides=None, dilation_rate=1, padding="valid", data_format=None
)

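Extracts patches from the image(s).

Arguments

  • images: Input image or batch of images. Must be 3D or 4D.
  • size: Patch size, either an int or a tuple (patch_height, patch_width).
  • strides: Strides along height and width. If not specified, or if None, it defaults to the same value as size.
  • dilation_rate: The input stride, specifying how far apart two consecutive patch samples are in the input. For values other than 1, strides must be 1. Note that strides > 1 is not supported in conjunction with dilation_rate > 1.
  • padding: The type of padding algorithm to use: "same" or "valid".
  • data_format: A string specifying the data format of the input tensor. It can be either "channels_last" or "channels_first". "channels_last" corresponds to inputs with shape (batch, height, width, channels), while "channels_first" corresponds to inputs with shape (batch, channels, height, width). If not specified, the value will default to keras.config.image_data_format.

Returns

Extracted patches, 3D (if not batched) or 4D (if batched).

Examples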

>>> image = np.random.random(
...     (2, 20, 20, 3)
... ).astype("float32") # batch of 2 RGB images
>>> patches = keras.ops.image.extract_patches(image, (5, 5))
>>> patches.shape
(2, 4, 4, 75)
>>> image = np.random.random((20, 20, 3)).astype("float32") # single RGB image
>>> patches = keras.ops.image.extract_patches(image, (3, 3), (1, 1))
>>> patches.shape
(18, 18, 27)
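
The output shapes in these examples follow from ordinary sliding-window arithmetic. The sketch below reproduces them under two assumptions: padding="valid" and dilation_rate=1. The helper patches_shape is hypothetical and only computes shapes; it does not extract any data.

```python
def patches_shape(height, width, channels, size, strides=None):
    """Output shape of "valid" patch extraction for one channels-last image."""
    patch_h, patch_w = (size, size) if isinstance(size, int) else size
    if strides is None:
        # Matches the documented default: strides equal to the patch size.
        strides = (patch_h, patch_w)
    elif isinstance(strides, int):
        strides = (strides, strides)
    out_h = (height - patch_h) // strides[0] + 1
    out_w = (width - patch_w) // strides[1] + 1
    # Each patch is flattened into the last axis.
    return out_h, out_w, patch_h * patch_w * channels


print(patches_shape(20, 20, 3, (5, 5)))          # (4, 4, 75)
print(patches_shape(20, 20, 3, (3, 3), (1, 1)))  # (18, 18, 27)
```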

map_coordinates function

keras.ops.image.map_coordinates(
    inputs, coordinates, order, fill_mode="constant", fill_value=0
)

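Maps the input array to new coordinates by interpolation.

Note that interpolation near boundaries differs from the scipy function, because we fixed an outstanding bug, scipy/issues/2640.

Arguments

  • inputs: The input array.
  • coordinates: The coordinates at which inputs is evaluated.
  • order: The order of the spline interpolation. The order must be 0 or 1. 0 indicates nearest neighbor and 1 indicates linear interpolation.
  • fill_mode: Points outside the boundaries of the input are filled according to the given mode. Available methods are "constant", "nearest", "wrap", "mirror" and "reflect". Defaults to "constant".
    • "constant": (k k k k | a b c d | k k k k) The input is extended by filling all values beyond the edge with the same constant value k specified by fill_value.
    • "nearest": (a a a a | a b c d | d d d d) The input is extended by the nearest pixel.
    • "wrap": (a b c d | a b c d | a b c d) The input is extended by wrapping around to the opposite edge.
    • "mirror": (c d c b | a b c d | c b a b) The input is extended by mirroring about the edge.
    • "reflect": (d c b a | a b c d | d c b a) The input is extended by reflecting about the edge of the last pixel.
  • fill_value: Value used for points outside the boundaries of the input if fill_mode="constant". Defaults to 0.

Returns

The input values mapped onto the new coordinates, for a single input or a batch of inputs.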
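
The fill-mode patterns can be reproduced on a small 1D array with np.pad, which may help when comparing results across libraries. The name correspondence in the dictionary below is an assumption derived only from the patterns listed for each mode (note that the documented "reflect" pattern matches NumPy's "symmetric", and "mirror" matches NumPy's "reflect"); it is not taken from the Keras source.

```python
import numpy as np

x = np.array([1, 2, 3, 4])  # stands for "a b c d"
pad = 3                     # show three values on each side

# fill_mode name -> np.pad keyword arguments (assumed correspondence)
numpy_equivalent = {
    "constant": dict(mode="constant", constant_values=0),
    "nearest": dict(mode="edge"),
    "wrap": dict(mode="wrap"),
    "mirror": dict(mode="reflect"),
    "reflect": dict(mode="symmetric"),
}

extended = {name: np.pad(x, pad, **kw) for name, kw in numpy_equivalent.items()}
for name, values in extended.items():
    print(f"{name:>8}: {values}")
```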


pad_images function

keras.ops.image.pad_images(
    images,
    top_padding=None,
    left_padding=None,
    bottom_padding=None,
    right_padding=None,
    target_height=None,
    target_width=None,
    data_format=None,
)

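Pads images with zeros to the specified height and width.

Arguments

  • images: Input image or batch of images. Must be 3D or 4D.
  • top_padding: Number of rows of zeros to add on top.
  • left_padding: Number of columns of zeros to add on the left.
  • bottom_padding: Number of rows of zeros to add at the bottom.
  • right_padding: Number of columns of zeros to add on the right.
  • target_height: Height of the output images.
  • target_width: Width of the output images.
  • data_format: A string specifying the data format of the input tensor. It can be either "channels_last" or "channels_first". "channels_last" corresponds to inputs with shape (batch, height, width, channels), while "channels_first" corresponds to inputs with shape (batch, channels, height, width). If not specified, the value will default to keras.config.image_data_format.

Returns

Padded image or batch of images.

Examples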

>>> images = np.random.random((15, 25, 3))
>>> padded_images = keras.ops.image.pad_images(
...     images, 2, 3, target_height=20, target_width=30
... )
>>> padded_images.shape
(20, 30, 3)
>>> batch_images = np.random.random((2, 15, 25, 3))
>>> padded_batch = keras.ops.image.pad_images(
...     batch_images, 2, 3, target_height=20, target_width=30
... )
>>> padded_batch.shape
(2, 20, 30, 3)
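
In these examples only the top and left padding are given, so the bottom and right padding follow from the target size. A minimal NumPy sketch of that arithmetic for a single channels-last image is shown below; pad_like is a hypothetical helper, not the Keras implementation.

```python
import numpy as np


def pad_like(images, top, left, target_height, target_width):
    """Zero-pad a channels-last (height, width, channels) array."""
    height, width = images.shape[:2]
    bottom = target_height - top - height
    right = target_width - left - width
    if bottom < 0 or right < 0:
        raise ValueError("target size is too small for the requested padding")
    # np.pad defaults to constant padding with zeros.
    return np.pad(images, ((top, bottom), (left, right), (0, 0)))


images = np.random.random((15, 25, 3))
padded = pad_like(images, top=2, left=3, target_height=20, target_width=30)
print(padded.shape)  # (20, 30, 3)
```

Here the helper adds 3 rows at the bottom (20 - 2 - 15) and 2 columns on the right (30 - 3 - 25).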

resize function

keras.ops.image.resize(
    images,
    size,
    interpolation="bilinear",
    antialias=False,
    crop_to_aspect_ratio=False,
    pad_to_aspect_ratio=False,
    fill_mode="constant",
    fill_value=0.0,
    data_format=None,
)

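Resizes images to the given size using the specified interpolation method.

Arguments

  • images: Input image or batch of images. Must be 3D or 4D.
  • size: Size of the output image in (height, width) format.
  • interpolation: Interpolation method. Available methods are "nearest", "bilinear" and "bicubic". Defaults to "bilinear".
  • antialias: Whether to use an antialiasing filter when downsampling an image. Defaults to False.
  • crop_to_aspect_ratio: If True, resize the images without aspect ratio distortion. When the original aspect ratio differs from the target aspect ratio, the output image will be cropped so as to return the largest possible window in the image (of size (height, width)) that matches the target aspect ratio. By default (crop_to_aspect_ratio=False), aspect ratio may not be preserved.
  • pad_to_aspect_ratio: If True, pad the images without aspect ratio distortion. When the original aspect ratio differs from the target aspect ratio, the output image will be evenly padded on the short side.
  • fill_mode: When using pad_to_aspect_ratio=True, padded areas are filled according to the given mode. Only "constant" is supported at this time (fill with a constant value, equal to fill_value).
  • fill_value: Float. Padding value to use when pad_to_aspect_ratio=True.
  • data_format: A string specifying the data format of the input tensor. It can be either "channels_last" or "channels_first". "channels_last" corresponds to inputs with shape (batch, height, width, channels), while "channels_first" corresponds to inputs with shape (batch, channels, height, width). If not specified, the value will default to keras.config.image_data_format.

Returns

Resized image or batch of images.

Examples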

>>> x = np.random.random((2, 4, 4, 3)) # batch of 2 RGB images
>>> y = keras.ops.image.resize(x, (2, 2))
>>> y.shape
(2, 2, 2, 3)
>>> x = np.random.random((4, 4, 3)) # single RGB image
>>> y = keras.ops.image.resize(x, (2, 2))
>>> y.shape
(2, 2, 3)
>>> x = np.random.random((2, 3, 4, 4)) # batch of 2 RGB images
>>> y = keras.ops.image.resize(x, (2, 2),
...     data_format="channels_first")
>>> y.shape
(2, 3, 2, 2)
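
The geometry behind crop_to_aspect_ratio and pad_to_aspect_ratio can be sketched with integer arithmetic. The two helpers below, crop_window and pad_amounts, are hypothetical and make assumptions the text does not state: the crop window is centered, and any odd padding remainder goes to the bottom or right. They compute only the window or padding that would precede the actual resize.

```python
def crop_window(height, width, target_height, target_width):
    """Largest centered window of the image with the target aspect ratio.

    Returns (top, left, crop_height, crop_width). Centering is an assumption.
    """
    crop_height = min(height, width * target_height // target_width)
    crop_width = min(width, height * target_width // target_height)
    top = (height - crop_height) // 2
    left = (width - crop_width) // 2
    return top, left, crop_height, crop_width


def pad_amounts(height, width, target_height, target_width):
    """Padding that brings the image to the target aspect ratio.

    Returns (top, bottom, left, right); only the short side is padded.
    """
    pad_height = max(height, width * target_height // target_width)
    pad_width = max(width, height * target_width // target_height)
    extra_h, extra_w = pad_height - height, pad_width - width
    return extra_h // 2, extra_h - extra_h // 2, extra_w // 2, extra_w - extra_w // 2


# A 100 x 200 (height x width) image resized to a square target:
print(crop_window(100, 200, 50, 50))  # (0, 50, 100, 100)
print(pad_amounts(100, 200, 50, 50))  # (50, 50, 0, 0)
```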