Google Imagen
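Imagen on Vertex AI brings Google's state-of-the-art image generation AI capabilities to application developers. With Imagen on Vertex AI, application developers can build next-generation AI products that turn their users' imagination into high-quality visual assets, generated by AI in seconds.
With Imagen on LangChain, you can perform the following tasks:
- VertexAIImageGeneratorChat: Generate novel images using only a text prompt (text-to-image AI generation).
- VertexAIImageEditorChat: Edit an entire uploaded or generated image with a text prompt.
- VertexAIImageCaptioning: Get a text description of an image with visual captioning.
- VertexAIVisualQnAChat: Get answers to a question about an image with Visual Question Answering (VQA).
- NOTE: Currently, only single-turn chat is supported for Visual Question Answering (VQA).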
Image Generation
Generate novel images using only a text prompt (text-to-image AI generation).
from langchain_core.messages import AIMessage, HumanMessage
from langchain_google_vertexai.vision_models import VertexAIImageGeneratorChat
# Create Image Generation model Object
generator = VertexAIImageGeneratorChat()
messages = [HumanMessage(content=["a cat at the beach"])]
response = generator.invoke(messages)
# To view the generated Image
generated_image = response.content[0]
import base64
import io
from PIL import Image
# Parse response object to get base64 string for image
img_base64 = generated_image["image_url"]["url"].split(",")[-1]
# Convert base64 string to Image
img = Image.open(io.BytesIO(base64.decodebytes(bytes(img_base64, "utf-8"))))
# view Image
img
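The parsing step above relies on the generated image arriving as a base64 data URL. As a minimal sketch of that step in isolation, the helper below splits such a URL into its MIME type and raw bytes using only the standard library. The name `decode_data_url` is hypothetical and not part of `langchain_google_vertexai`, and the `data:<mime>;base64,<payload>` layout is an assumption inferred from the `split(",")` call above.

```python
import base64
from typing import Tuple


def decode_data_url(data_url: str) -> Tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes).

    Hypothetical helper for illustration; assumes the layout
    "data:<mime>;base64,<payload>".
    """
    header, _, payload = data_url.partition(",")
    if not payload:
        # No comma found: treat the whole input as a bare base64 string.
        header, payload = "", data_url
    # The header looks like "data:image/png;base64"; keep only the MIME type.
    if header.startswith("data:"):
        header = header[len("data:"):]
    mime_type = header.split(";")[0]
    return mime_type, base64.b64decode(payload)


# Placeholder payload standing in for real image bytes.
sample = "data:image/png;base64," + base64.b64encode(b"fake-bytes").decode("ascii")
mime, raw = decode_data_url(sample)
```

With a real response, the returned bytes could be passed to `Image.open(io.BytesIO(raw))` in the same way as in the example above.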
Image Editing
Edit an entire uploaded or generated image with a text prompt.
Edit a generated image
from langchain_core.messages import AIMessage, HumanMessage
from langchain_google_vertexai.vision_models import (
    VertexAIImageEditorChat,
    VertexAIImageGeneratorChat,
)
# Create Image Generation model Object
generator = VertexAIImageGeneratorChat()
# Provide a text input for image
messages = [HumanMessage(content=["a cat at the beach"])]
# call the model to generate an image
response = generator.invoke(messages)
# read the image object from the response
generated_image = response.content[0]
# Create Image Editor model Object
editor = VertexAIImageEditorChat()
# Write prompt for editing and pass the "generated_image"
messages = [HumanMessage(content=[generated_image, "a dog at the beach"])]
# Call the model for editing Image
editor_response = editor.invoke(messages)
import base64
import io
from PIL import Image
# Parse response object to get base64 string for image
edited_img_base64 = editor_response.content[0]["image_url"]["url"].split(",")[-1]
# Convert base64 string to Image
edited_img = Image.open(
    io.BytesIO(base64.decodebytes(bytes(edited_img_base64, "utf-8")))
)
# view Image
edited_img
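The example above edits a generated image. For an uploaded image, one option is to wrap the raw bytes in a message part with the same `image_url` shape that the generator returns. The sketch below shows that encoding step only. The helper name `image_part_from_bytes` is hypothetical, the default MIME type is a guess, and whether the editor accepts a hand-built part in exactly this form is an assumption, not something this page confirms.

```python
import base64


def image_part_from_bytes(raw: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes in an image_url message part.

    Hypothetical helper for illustration; the dict shape mirrors the
    {"type": "image_url", "image_url": {"url": ...}} parts used on this page.
    """
    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Placeholder bytes; in practice these would come from reading an image file.
uploaded_bytes = b"placeholder image bytes"
part = image_part_from_bytes(uploaded_bytes)
```

The resulting `part` would take the place of `generated_image` in the `HumanMessage` content list passed to the editor.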
Image Captioning
from langchain_google_vertexai import VertexAIImageCaptioning
# Initialize the Image Captioning Object
model = VertexAIImageCaptioning()
NOTE: we're using the image generated in the Image Generation section.
# use the image generated in the Image Generation section
img_base64 = generated_image["image_url"]["url"]
response = model.invoke(img_base64)
print(f"Generated Caption : {response}")
# Convert base64 string to Image
img = Image.open(
    io.BytesIO(base64.decodebytes(bytes(img_base64.split(",")[-1], "utf-8")))
)
# display Image
img
Generated Caption : a cat sitting on the beach looking at the camera
Visual Question Answering (VQA)
from langchain_google_vertexai import VertexAIVisualQnAChat
model = VertexAIVisualQnAChat()
NOTE: we're using the image generated in the Image Generation section.
question = "What animal is shown in the image?"
response = model.invoke(
    input=[
        HumanMessage(
            content=[
                {"type": "image_url", "image_url": {"url": img_base64}},
                question,
            ]
        )
    ]
)
print(f"question : {question}\nanswer : {response.content}")
# Convert base64 string to Image
img = Image.open(
    io.BytesIO(base64.decodebytes(bytes(img_base64.split(",")[-1], "utf-8")))
)
# display Image
img
question : What animal is shown in the image?
answer : cat
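Because only single-turn chat is supported for VQA, several questions about one image would each need their own call. The sketch below shows one way to organize that, under the assumption that each question is sent independently with the image attached. `ask_each` and `fake_invoke` are hypothetical names; the stub stands in for the model so the snippet runs without Vertex AI credentials.

```python
def ask_each(invoke_single_turn, image_url, questions):
    """Ask every question in its own single-turn request.

    invoke_single_turn is any callable that takes a message content list
    (image part plus question text) and returns the answer.
    """
    answers = {}
    for question in questions:
        content = [
            {"type": "image_url", "image_url": {"url": image_url}},
            question,
        ]
        answers[question] = invoke_single_turn(content)
    return answers


def fake_invoke(content):
    # Stub standing in for the real model call.
    return f"stub answer to: {content[1]}"


answers = ask_each(
    fake_invoke,
    "data:image/png;base64,AAAA",
    ["What animal is shown in the image?", "Where is the animal?"],
)
```

With the real model, the callable could be `lambda content: model.invoke([HumanMessage(content=content)]).content`, following the invocation shown above.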