<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Text-guided image-inpainting
[[open-in-colab]]
The [`StableDiffusionInpaintPipeline`] lets you edit specific parts of an image by providing a mask and a text prompt. It uses a version of Stable Diffusion, such as [`runwayml/stable-diffusion-inpainting`](https://huggingface.co/runwayml/stable-diffusion-inpainting), that was trained specifically for inpainting.
Get started by loading an instance of the [`StableDiffusionInpaintPipeline`]:
```python
import PIL.Image
import requests
import torch
from io import BytesIO

from diffusers import StableDiffusionInpaintPipeline

# Load the half-precision weights and move the pipeline to the GPU
pipeline = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
pipeline = pipeline.to("cuda")
```
Download an image of a dog, along with a mask that covers the dog you'll eventually replace:
```python
def download_image(url):
    # Fetch the file and decode it as an RGB PIL image
    response = requests.get(url)
    response.raise_for_status()  # fail early on a bad download
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

# The image and the mask must have the same size
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
```
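
If you don't have a ready-made mask file, you can draw one. The following is a minimal sketch, assuming the convention used later on this page (white pixels are repainted, black pixels are kept) and assuming the pipeline accepts a single-channel PIL image as `mask_image`. The helper name `make_box_mask` and the box coordinates are hypothetical and not part of diffusers:

```python
from PIL import Image, ImageDraw


def make_box_mask(size, box):
    """Return an "L"-mode mask that is white inside `box` and black elsewhere.

    `size` is (width, height); `box` is (left, top, right, bottom) in pixels.
    Hypothetical helper for illustration, not a diffusers API.
    """
    mask = Image.new("L", size, 0)  # start fully black: keep every pixel
    ImageDraw.Draw(mask).rectangle(box, fill=255)  # white: region to repaint
    return mask


# Hypothetical coordinates; choose a box that covers the object to replace.
box_mask = make_box_mask((512, 512), (128, 128, 384, 384))
```

A rectangular mask is coarse, so expect the model to repaint some background inside the box along with the object.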
Now write a prompt that describes what should replace the masked region:
```python
prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
```
`image` | `mask_image` | `prompt` | output |
:-------------------------:|:-------------------------:|:-------------------------:|-------------------------:|
<img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="250"/> | <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" alt="drawing" width="250"/> | ***Face of a yellow cat, high resolution, sitting on a park bench*** | <img src="https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/in_paint/yellow_cat_sitting_on_a_park_bench.png" alt="drawing" width="250"/> |
<Tip warning={true}>
A previous experimental implementation of inpainting used a different, lower-quality process. To ensure backwards compatibility, loading a pretrained pipeline that doesn't contain the new model will still apply the old inpainting method.
</Tip>
Check out the Space below to try image inpainting yourself!
<iframe
src="https://runwayml-stable-diffusion-inpainting.hf.space"
frameborder="0"
width="850"
height="500"
></iframe>
## Preserving the Unmasked Area of the Image
The [`StableDiffusionInpaintPipeline`] (and other inpainting pipelines) generally changes the unmasked part of the image as well as the masked part. If this is undesirable, you can force the unmasked area to stay the same by copying the original pixels back over the pipeline output:
```python
import PIL.Image
import numpy as np
import torch

from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

device = "cuda"
pipeline = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
)
pipeline = pipeline.to(device)

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

init_image = load_image(img_url).resize((512, 512))
mask_image = load_image(mask_url).resize((512, 512))

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
repainted_image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
repainted_image.save("repainted_image.png")

# Convert mask to grayscale NumPy array
mask_image_arr = np.array(mask_image.convert("L"))
# Add a channel dimension to the end of the grayscale mask
mask_image_arr = mask_image_arr[:, :, None]
# Scale to [0, 1], then binarize: 1s correspond to the pixels which are repainted
mask_image_arr = mask_image_arr.astype(np.float32) / 255.0
mask_image_arr[mask_image_arr < 0.5] = 0
mask_image_arr[mask_image_arr >= 0.5] = 1

# Take the masked pixels from the repainted image and the unmasked pixels from the initial image
init_image_arr = np.asarray(init_image, dtype=np.float32)
repainted_image_arr = np.asarray(repainted_image, dtype=np.float32)
unmasked_unchanged_image_arr = (1 - mask_image_arr) * init_image_arr + mask_image_arr * repainted_image_arr
unmasked_unchanged_image = PIL.Image.fromarray(unmasked_unchanged_image_arr.round().astype("uint8"))
unmasked_unchanged_image.save("force_unmasked_unchanged.png")
```
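
To confirm that the compositing step really leaves the unmasked pixels alone, you can measure the largest pixel difference outside the mask. This is a minimal sketch on small synthetic arrays so that it runs without a GPU. The helper name `max_change_outside_mask` is hypothetical, and the threshold of 128 is an assumption chosen to match the 0.5 cutoff used above:

```python
import numpy as np


def max_change_outside_mask(before, after, mask, threshold=128):
    """Largest absolute per-channel difference among pixels the mask marks as 'keep'.

    `before` and `after` are HxWx3 uint8 arrays; `mask` is an HxW uint8 array in
    which values >= `threshold` mark repainted pixels. Hypothetical helper.
    """
    keep = np.asarray(mask) < threshold  # True where pixels should be untouched
    # Use a signed dtype so the subtraction cannot wrap around
    diff = np.abs(np.asarray(before, dtype=np.int16) - np.asarray(after, dtype=np.int16))
    return int(diff[keep].max()) if keep.any() else 0


# Synthetic stand-ins for the real images
rng = np.random.default_rng(0)
before = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
repainted = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 255  # repaint the centre square

# Same rule as above: masked pixels from `repainted`, the rest from `before`
composited = np.where((mask >= 128)[:, :, None], repainted, before)

print(max_change_outside_mask(before, composited, mask))  # 0 by construction
```

Running the same helper on the raw pipeline output instead of the composited image gives a rough measure of how much the pipeline altered the unmasked area.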
Forcing the unmasked portion of the image to remain the same can produce visible, unnatural transitions at the mask boundary, because the model typically adjusts both the masked and unmasked areas to make the transition look natural.
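
One possible way to soften those transitions is to blur the mask before compositing, so the two images are mixed gradually near the boundary. This technique is not part of the pipeline and is shown only as a sketch under stated assumptions: the helper name `feathered_composite` and the `radius` parameter are hypothetical, and the images below are synthetic stand-ins. It uses Pillow's `Image.composite` and `ImageFilter.GaussianBlur`:

```python
from PIL import Image, ImageFilter


def feathered_composite(init_image, repainted_image, mask_image, radius=8):
    """Blend `repainted_image` into `init_image` through a blurred mask.

    `radius` (in pixels) is a hypothetical tuning knob; larger values give a
    wider transition band. Hypothetical helper, not a diffusers API.
    """
    soft_mask = mask_image.convert("L").filter(ImageFilter.GaussianBlur(radius))
    # Image.composite takes the first image where the mask is white and the
    # second where it is black, mixing the two for gray values in between.
    return Image.composite(repainted_image, init_image, soft_mask)


# Synthetic stand-ins: a black original, a white "repaint", and a square mask
init = Image.new("RGB", (64, 64), "black")
repainted = Image.new("RGB", (64, 64), "white")
mask = Image.new("L", (64, 64), 0)
mask.paste(255, (16, 16, 48, 48))

blended = feathered_composite(init, repainted, mask, radius=4)
```

The trade-off is that pixels just outside the mask are no longer strictly unchanged: within the blurred band they become a mix of the original and the repainted image.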