Image Operations Based on Stable Diffusion in Diffusers

Posted by iSherryZhang on 2023-02-24

Original URL: https://www.cnblogs.com/shuezhang/p/17150635.html

Diffusers provides the following image operations based on Stable Diffusion:

Text-to-image generation: StableDiffusionPipeline
Text-guided image-to-image generation: StableDiffusionImg2ImgPipeline
In-painting: StableDiffusionInpaintPipeline
Text-guided image super-resolution: StableDiffusionUpscalePipeline
Generating variations of an input image: StableDiffusionImageVariationPipeline
Image editing by following text instructions: StableDiffusionInstructPix2PixPipeline
...

Helper functions

import requests
from PIL import Image
from io import BytesIO

def show_images(imgs, rows=1, cols=3):
    """Paste imgs into a rows x cols grid, filled row by row."""
    assert len(imgs) == rows * cols
    # Size every cell to the largest width and height among the inputs.
    cell_w = max(img.size[0] for img in imgs)
    cell_h = max(img.size[1] for img in imgs)

    grid = Image.new('RGB', size=(cols * cell_w, rows * cell_h))
    for i, img in enumerate(imgs):
        # Column is i % cols, row is i // cols.
        grid.paste(img, box=(i % cols * cell_w, i // cols * cell_h))
    return grid

def download_image(url):
    """Download an image and return it as an RGB PIL image."""
    response = requests.get(url)
    response.raise_for_status()  # fail early on HTTP errors
    return Image.open(BytesIO(response.content)).convert("RGB")
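
To make the layout rule in show_images concrete, here is a small sketch that computes only the paste offsets, without creating any images. The name grid_offsets is hypothetical and not part of the original helpers, and the sketch assumes every cell has the same width and height.

```python
def grid_offsets(n, cols, cell_w, cell_h):
    """Return the top-left (x, y) paste offset for each of n images.

    Mirrors the row-major rule used by show_images: image i goes to
    column i % cols and row i // cols. Hypothetical helper for illustration.
    """
    return [(i % cols * cell_w, i // cols * cell_h) for i in range(n)]


# Six 512 x 512 images laid out in a grid with 3 columns (2 rows).
offsets = grid_offsets(6, cols=3, cell_w=512, cell_h=512)
print(offsets)
```

The first three offsets share y = 0 (the first row), and the fourth wraps back to x = 0 on the second row.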

Text-To-Image

Generates images from text. In diffusers this is implemented by StableDiffusionPipeline, and the only required input is prompt. Example code:

from diffusers import StableDiffusionPipeline

# Load the pretrained Stable Diffusion v1.4 weights.
image_pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

device = "cuda"
image_pipe.to(device)

# Passing a list of three identical prompts yields three images.
prompt = ["a photograph of an astronaut riding a horse"] * 3
out_images = image_pipe(prompt).images
for i, out_image in enumerate(out_images):
    out_image.save(f"astronaut_rides_horse{i}.png")
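
The example above hard-codes "cuda" as the device. On a machine without an NVIDIA GPU a fallback may be needed. The following is a minimal sketch that assumes only that PyTorch may or may not be installed; pick_device is a hypothetical helper name, not a diffusers function.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu".

    Hypothetical helper: it is not part of diffusers.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The result can replace the hard-coded string, for example image_pipe.to(pick_device()). Running Stable Diffusion on the CPU is much slower, and the torch.float16 setting used in the later examples is generally intended for GPUs.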

Example output:

Image-To-Image

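Generates a new image from a text prompt and an initial image. In diffusers this is implemented by the StableDiffusionImg2ImgPipeline class. The pipeline has two required inputs: prompt and the initial image (init_image, passed as the image argument in the code below). Example code: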

import torch
from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda"
model_id_or_path = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
pipe = pipe.to(device)

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = download_image(url)
init_image = init_image.resize((768, 512))

prompt = "A fantasy landscape, trending on artstation"

images = pipe(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images

grid_img = show_images([init_image, images[0]], 1, 2)
grid_img.save("fantasy_landscape.png")
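
The strength argument (0.75 above) controls how far the result may drift from init_image. As far as I understand recent diffusers releases, the img2img pipeline adds noise to the input and then runs only about int(num_inference_steps * strength) of the scheduled denoising steps; older releases also applied a scheduler offset, so treat this as an approximation. A sketch of that arithmetic, with effective_steps as a hypothetical name and 50 assumed as the step count:

```python
def effective_steps(num_inference_steps, strength):
    """Approximate number of denoising steps img2img actually runs.

    Assumption: follows the rule int(num_inference_steps * strength),
    capped at num_inference_steps. Hypothetical helper, not a diffusers API.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)


# Compare a few strength values for a 50-step schedule.
for s in (0.25, 0.5, 0.75, 1.0):
    print(s, effective_steps(50, s))
```

A low strength therefore keeps more of the original sketch and also finishes faster, while strength=1.0 runs the full schedule and largely ignores the input image.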

Example output:

In-painting

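Given a mask image and a prompt, edits a specific region of an image. This is implemented by StableDiffusionInpaintPipeline, which takes three inputs: the original image, the mask image, and a prompt.

Example code: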

import torch
from diffusers import StableDiffusionInpaintPipeline

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

pipe = StableDiffusionInpaintPipeline.from_pretrained("runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
images = pipe(prompt=prompt, image=init_image, mask_image=mask_image).images
grid_img = show_images([init_image, mask_image, images[0]], 1, 3)
grid_img.save("overture-creations.png")
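
Both images above are resized to 512 x 512 before being passed in. Stable Diffusion pipelines generally expect a width and height divisible by 8 (the VAE downsamples by that factor), so when keeping a different aspect ratio it can help to snap the size first. A sketch under that assumption; snap_to_multiple is a hypothetical helper:

```python
def snap_to_multiple(width, height, multiple=8):
    """Round width and height down to the nearest multiple.

    Each side is kept at no less than one multiple. Hypothetical helper.
    """
    def snap(value):
        return max(multiple, value - value % multiple)

    return snap(width), snap(height)


# A 500 x 333 photo is snapped to a size that is divisible by 8.
print(snap_to_multiple(500, 333))
```

The image and the mask should end up with the same size, for example by computing size = snap_to_multiple(*init_image.size) once and calling resize(size) on both.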

Example output:

Upscale

Performs super-resolution on a low-resolution image, implemented by StableDiffusionUpscalePipeline. The required inputs are prompt and the low-resolution image. Example code:

import torch
from diffusers import StableDiffusionUpscalePipeline

# load model and scheduler
model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipeline = StableDiffusionUpscalePipeline.from_pretrained(model_id, torch_dtype=torch.float16, cache_dir="./models/")
pipeline = pipeline.to("cuda")

# download a low-resolution image
url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
low_res_img = download_image(url)
low_res_img = low_res_img.resize((128, 128))

prompt = "a white cat"
upscaled_image = pipeline(prompt=prompt, image=low_res_img).images[0]
grid_img = show_images([low_res_img, upscaled_image], 1, 2)
grid_img.save("a_white_cat.png")
print("low_res_img size: ", low_res_img.size)
print("upscaled_image size: ", upscaled_image.size)

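Example output. By default, a 128 x 128 cat image is upscaled to 512 x 512:

By default, both the height and the width of the input are enlarged by a factor of four, that is: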

input: 128 x 128 ==> output: 512 x 512
input: 64 x 256 ==> output: 256 x 1024
...
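
The size rule listed above can be written as a one-line calculation, which is handy for checking the output size before running the model. In this sketch upscaled_size is a hypothetical helper, and sizes are (width, height) tuples as in PIL:

```python
def upscaled_size(width, height, factor=4):
    """Return the output (width, height) for an upscaler with the given factor.

    The default factor of 4 matches the x4 upscaler used above.
    Hypothetical helper for illustration.
    """
    return width * factor, height * factor


for size in [(128, 128), (64, 256)]:
    print(size, "==>", upscaled_size(*size))
```

Because the pixel count grows by a factor of 16, larger inputs can quickly exhaust GPU memory, which is one reason the example resizes the input to 128 x 128 first.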

In my experience the prompt has little effect here, so almost any text will do.

For details on this model, see the reference.

Instruct-Pix2Pix

Key reference

Edits an image according to an instruction prompt, implemented by StableDiffusionInstructPix2PixPipeline. The required inputs are prompt and image. Example code:

import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16, cache_dir="./models/")
pipe = pipe.to("cuda")

url = "https://huggingface.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
image = download_image(url)

prompt = "make the mountains snowy"
images = pipe(prompt, image=image, num_inference_steps=20, image_guidance_scale=1.5, guidance_scale=7).images
grid_img = show_images([image, images[0]], 1, 2)
grid_img.save("snowy_mountains.png")
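
The call above sets two guidance values. According to the diffusers documentation, guidance_scale pushes the result toward the text instruction, while image_guidance_scale keeps it closer to the input image. Finding a good balance usually takes a few tries, so here is a sketch that enumerates a small grid of settings. The candidate values are arbitrary assumptions, and the pipe call is left as a comment because it needs the loaded model.

```python
from itertools import product

text_scales = [5.0, 7.0, 9.0]    # candidate guidance_scale values (assumed)
image_scales = [1.0, 1.5, 2.0]   # candidate image_guidance_scale values (assumed)

# Every combination of the two scales, as keyword arguments for the pipeline.
settings = [
    {"guidance_scale": g, "image_guidance_scale": ig}
    for g, ig in product(text_scales, image_scales)
]

for kwargs in settings:
    # With the pipeline loaded, each setting could be tried as:
    # images = pipe(prompt, image=image, num_inference_steps=20, **kwargs).images
    print(kwargs)
```

The nine resulting images could then be laid out with show_images(results, 3, 3) to compare the settings side by side.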

Example output:
