StableDiffusion image-generation model

`StableDiffusion` class

keras_cv.models.StableDiffusion(img_height=512, img_width=512, jit_compile=True)

Keras implementation of Stable Diffusion.

Note that the StableDiffusion API, as well as the APIs of the sub-components of StableDiffusion (e.g. ImageEncoder, DiffusionModel), should be considered unstable at this point. We do not guarantee backwards compatibility for future changes to these APIs.

Stable Diffusion is a powerful image generation model that can be used, among other things, to generate pictures according to a short text description (called a "prompt").

Arguments

img_height: int, height of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.
img_width: int, width of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.
jit_compile: bool, whether to compile the underlying models to XLA. This can lead to a significant speedup on some systems. Defaults to True.
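The rounding rule for img_height and img_width can be illustrated with a small sketch. The helper name `snap_to_multiple` is hypothetical and not part of KerasCV. It assumes that "nearest valid value" means rounding to the nearest multiple of 128 using Python's built-in `round`; how the library resolves exact ties (e.g. 576) is not stated here, so the sketch avoids those inputs.

```python
def snap_to_multiple(size, multiple=128):
    """Round `size` to the nearest multiple of `multiple`.

    Hypothetical helper for illustration only; not a KerasCV function.
    """
    return round(size / multiple) * multiple


# A few requested sizes and the size that would actually be used
# under the assumption described above.
for requested in (500, 512, 600):
    print(requested, "->", snap_to_multiple(requested))
```

Under this assumption a request for 500 pixels is treated as 512 and a request for 600 pixels as 640, so it may be simplest to pass a multiple of 128 directly.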

Example

from keras_cv.models import StableDiffusion
from PIL import Image

model = StableDiffusion(img_height=512, img_width=512, jit_compile=True)
img = model.text_to_image(
    prompt="A beautiful horse running through a field",
    batch_size=1,  # How many images to generate at once
    num_steps=25,  # Number of iterations (controls image quality)
    seed=123,  # Set this to always get the same image from the same prompt
)
Image.fromarray(img[0]).save("horse.png")
print("saved at horse.png")

`StableDiffusionV2` class

keras_cv.models.StableDiffusionV2(img_height=512, img_width=512, jit_compile=True)

Keras implementation of Stable Diffusion v2.

Note that the StableDiffusionV2 API, as well as the APIs of the sub-components of StableDiffusionV2 (e.g. ImageEncoder, DiffusionModelV2), should be considered unstable at this point. We do not guarantee backwards compatibility for future changes to these APIs.

Stable Diffusion is a powerful image generation model that can be used, among other things, to generate pictures according to a short text description (called a "prompt").

Arguments

img_height: int, height of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.
img_width: int, width of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.
jit_compile: bool, whether to compile the underlying models to XLA. This can lead to a significant speedup on some systems. Defaults to True.

Example

from keras_cv.models import StableDiffusionV2
from PIL import Image

model = StableDiffusionV2(img_height=512, img_width=512, jit_compile=True)
img = model.text_to_image(
    prompt="A beautiful horse running through a field",
    batch_size=1,  # How many images to generate at once
    num_steps=25,  # Number of iterations (controls image quality)
    seed=123,  # Set this to always get the same image from the same prompt
)
Image.fromarray(img[0]).save("horse.png")
print("saved at horse.png")

`Decoder` class

keras_cv.models.stable_diffusion.Decoder(
    img_height, img_width, name=None, download_weights=True
)

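Sequential groups a linear stack of layers into a Model.

Examples:

model = keras.Sequential()
model.add(keras.Input(shape=(16,)))
model.add(keras.layers.Dense(8))

# Note that you can also omit the initial `Input`.
# In that case the model has no weights until the first call to a
# training/evaluation method (since it isn't yet built):
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(4))
# model.weights not created yet

# Whereas if you specify an `Input`, the model gets built
# continuously as you add layers:
model = keras.Sequential()
model.add(keras.Input(shape=(16,)))
model.add(keras.layers.Dense(8))
len(model.weights)  # Returns "2"

# When using the delayed-build pattern (no input shape specified), you can
# choose to build the model manually by calling `build(batch_input_shape)`:
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(4))
model.build((None, 16))
len(model.weights)  # Returns "4"

# Note that when using the delayed-build pattern (no input shape specified),
# the model gets built the first time you call `fit`, `evaluate`, or `predict`,
# or the first time you call the model on some input data.
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(1))
model.compile(optimizer='sgd', loss='mse')
# This builds the model for the first time
# (`x` and `y` stand for your own training inputs and targets):
model.fit(x, y, batch_size=32, epochs=10)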
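Why a delayed-build model has no weights until it sees an input shape can be shown without Keras at all: a Dense layer's kernel has shape (input_dim, units), so it cannot be allocated before input_dim is known. The sketch below only does the shape bookkeeping for a stack of Dense layers. `dense_stack_shapes` is a hypothetical helper, not a Keras function, and it assumes every Dense layer keeps its bias (the Keras default).

```python
def dense_stack_shapes(input_dim, units_per_layer):
    """Return [(kernel_shape, bias_shape), ...] for a stack of Dense layers.

    Hypothetical helper for illustration; it mirrors how each layer's
    kernel depends on the output size of the layer before it.
    """
    shapes = []
    for units in units_per_layer:
        shapes.append(((input_dim, units), (units,)))
        input_dim = units  # the next layer consumes this layer's output
    return shapes


# Same stack as the `build((None, 16))` example: Dense(8) then Dense(4).
shapes = dense_stack_shapes(16, [8, 4])

# One kernel and one bias per layer.
num_weight_tensors = sum(len(pair) for pair in shapes)

# Scalar parameters: rows * cols for each kernel, plus the bias length.
num_parameters = sum(k[0] * k[1] + b[0] for k, b in shapes)

print(shapes)
print(num_weight_tensors, num_parameters)
```

The count of four weight tensors matches the `len(model.weights)` value shown for the manually built model: two layers, each with a kernel and a bias.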

`DiffusionModel` class

keras_cv.models.stable_diffusion.DiffusionModel(
    img_height, img_width, max_text_length, name=None, download_weights=True
)

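A model grouping layers into an object with training/inference features.

There are three ways to instantiate a Model:

With the "Functional API"

You start from Input, chain layer calls to specify the model's forward pass, and finally create your model from inputs and outputs: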

inputs = keras.Input(shape=(37,))
x = keras.layers.Dense(32, activation="relu")(inputs)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

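Note: only dicts, lists, and tuples of input tensors are supported. Nested inputs (e.g. lists of lists or dicts of dicts) are not supported.

A new Functional API model can also be created from intermediate tensors. This lets you quickly extract sub-components of a model.

Example: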

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=128, height=128)(inputs)
conv = keras.layers.Conv2D(filters=32, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

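Note that the backbone and activations models are not created with keras.Input objects, but with tensors that originate from keras.Input objects. Under the hood, the layers and weights are shared across these models, so you can train full_model and use backbone or activations for feature extraction. The inputs and outputs of the model can be nested structures of tensors as well, and the created models are standard Functional API models that support all the existing APIs.

By subclassing the `Model` class

In that case, you should define your layers in __init__() and implement the model's forward pass in call().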

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()

If you subclass Model, you can optionally have a training argument (boolean) in call(), which you can use to specify different behavior in training and inference:

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")
        self.dropout = keras.layers.Dropout(0.5)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()
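What a `training` flag changes in practice can be sketched in plain Python. The function below is a hypothetical stand-in for a Dropout layer, not Keras code. It assumes the usual "inverted dropout" convention: during training each value is zeroed with probability `rate` and the survivors are scaled by 1 / (1 - rate), while at inference the input passes through unchanged.

```python
import random


def dropout(values, rate, training=False, rng=None):
    """Inverted dropout on a list of floats (illustrative stand-in only)."""
    if not training or rate == 0.0:
        return list(values)  # inference: identity
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - rate)  # keeps the expected value unchanged
    return [v * scale if rng.random() >= rate else 0.0 for v in values]


x = [1.0, 2.0, 3.0, 4.0]

# Inference: the values come back untouched.
print(dropout(x, rate=0.5, training=False))

# Training: each value is either dropped (0.0) or doubled (scale = 2.0).
print(dropout(x, rate=0.5, training=True, rng=random.Random(0)))
```

Forwarding `training=training` from call() to such a layer, as in the subclassed model above, is what lets a single model definition behave differently under fit() and predict().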

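Once the model is created, you can configure it with losses and metrics using model.compile(), train it with model.fit(), or use it to make predictions with model.predict().

With the `Sequential` class

In addition, keras.Sequential is a special case of Model where the model is purely a stack of single-input, single-output layers.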

model = keras.Sequential([
    keras.Input(shape=(None, None, 3)),
    keras.layers.Conv2D(filters=32, kernel_size=3),
])

`ImageEncoder` class

keras_cv.models.stable_diffusion.ImageEncoder(download_weights=True)

ImageEncoder is the VAE Encoder for StableDiffusion.

`NoiseScheduler` class

keras_cv.models.stable_diffusion.NoiseScheduler(
    train_timesteps=1000,
    beta_start=0.0001,
    beta_end=0.02,
    beta_schedule="linear",
    variance_type="fixed_small",
    clip_sample=True,
)

Arguments

train_timesteps: number of diffusion steps used to train the model.
beta_start: the starting `beta` value of inference.
beta_end: the final `beta` value.
beta_schedule: the beta schedule, a mapping from a beta range to a
    sequence of betas for stepping the model. Choose from `linear` or
    `quadratic`.
variance_type: options to clip the variance used when adding noise to
    the de-noised sample. Choose from `fixed_small`, `fixed_small_log`,
    `fixed_large`, `fixed_large_log`, `learned` or `learned_range`.
clip_sample: option to clip predicted sample between -1 and 1 for
    numerical stability.
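The beta-related arguments can be made concrete with a small, dependency-free sketch. All function names below are hypothetical and this is not the NoiseScheduler implementation. In particular, treating `quadratic` as "interpolate linearly in sqrt(beta), then square" is an assumption based on a common convention for diffusion schedules; the argument list above does not spell out the formula.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (num >= 2)."""
    step = (stop - start) / (num - 1)
    return [start + step * i for i in range(num)]


def make_betas(train_timesteps=1000, beta_start=0.0001, beta_end=0.02,
               beta_schedule="linear"):
    """Build one beta per training timestep (illustrative sketch)."""
    if beta_schedule == "linear":
        return linspace(beta_start, beta_end, train_timesteps)
    if beta_schedule == "quadratic":
        # Assumption: linear in sqrt(beta), then squared.
        roots = linspace(beta_start ** 0.5, beta_end ** 0.5, train_timesteps)
        return [r ** 2 for r in roots]
    raise ValueError(f"Unknown beta_schedule: {beta_schedule!r}")


def cumulative_alphas(betas):
    """Running product of (1 - beta): the fraction of signal kept so far."""
    kept, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        kept.append(running)
    return kept


def clip_sample(value, low=-1.0, high=1.0):
    """Clamp a predicted sample value to [low, high]."""
    return max(low, min(high, value))


linear_betas = make_betas(beta_schedule="linear")
quadratic_betas = make_betas(beta_schedule="quadratic")
alphas_cumprod = cumulative_alphas(linear_betas)

print(linear_betas[0], linear_betas[-1])
print(alphas_cumprod[0], alphas_cumprod[-1])
```

Both schedules share the same endpoints; under the assumption above the quadratic one stays below the linear one in between, so noise is added more gently at early timesteps.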

`SimpleTokenizer` class

keras_cv.models.stable_diffusion.SimpleTokenizer(bpe_path=None)

`TextEncoder` class

keras_cv.models.stable_diffusion.TextEncoder(
    max_length, vocab_size=49408, name=None, download_weights=True
)

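A model grouping layers into an object with training/inference features.

There are three ways to instantiate a Model:

With the "Functional API"

You start from Input, chain layer calls to specify the model's forward pass, and finally create your model from inputs and outputs: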

inputs = keras.Input(shape=(37,))
x = keras.layers.Dense(32, activation="relu")(inputs)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

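Note: only dicts, lists, and tuples of input tensors are supported. Nested inputs (e.g. lists of lists or dicts of dicts) are not supported.

A new Functional API model can also be created from intermediate tensors. This lets you quickly extract sub-components of a model.

Example: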

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=128, height=128)(inputs)
conv = keras.layers.Conv2D(filters=32, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

By subclassing the `Model` class

In that case, you should define your layers in __init__() and implement the model's forward pass in call().

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()

If you subclass Model, you can optionally have a training argument (boolean) in call(), which you can use to specify different behavior in training and inference:

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")
        self.dropout = keras.layers.Dropout(0.5)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()

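Once the model is created, you can configure it with losses and metrics using model.compile(), train it with model.fit(), or use it to make predictions with model.predict().

With the `Sequential` class

In addition, keras.Sequential is a special case of Model where the model is purely a stack of single-input, single-output layers.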

model = keras.Sequential([
    keras.Input(shape=(None, None, 3)),
    keras.layers.Conv2D(filters=32, kernel_size=3),
])

`TextEncoderV2` class

keras_cv.models.stable_diffusion.TextEncoderV2(
    max_length, vocab_size=49408, name=None, download_weights=True
)

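A model grouping layers into an object with training/inference features.

There are three ways to instantiate a Model:

With the "Functional API"

You start from Input, chain layer calls to specify the model's forward pass, and finally create your model from inputs and outputs: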

inputs = keras.Input(shape=(37,))
x = keras.layers.Dense(32, activation="relu")(inputs)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

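Note: only dicts, lists, and tuples of input tensors are supported. Nested inputs (e.g. lists of lists or dicts of dicts) are not supported.

A new Functional API model can also be created from intermediate tensors. This lets you quickly extract sub-components of a model.

Example: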

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=128, height=128)(inputs)
conv = keras.layers.Conv2D(filters=32, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

By subclassing the `Model` class

In that case, you should define your layers in __init__() and implement the model's forward pass in call().

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()

If you subclass Model, you can optionally have a training argument (boolean) in call(), which you can use to specify different behavior in training and inference:

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")
        self.dropout = keras.layers.Dropout(0.5)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()

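Once the model is created, you can configure it with losses and metrics using model.compile(), train it with model.fit(), or use it to make predictions with model.predict().

With the `Sequential` class

In addition, keras.Sequential is a special case of Model where the model is purely a stack of single-input, single-output layers.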

model = keras.Sequential([
    keras.Input(shape=(None, None, 3)),
    keras.layers.Conv2D(filters=32, kernel_size=3),
])
