---
license: apache-2.0
tags:
- openpose
- controlnet
- diffusers
- controlnet-openpose-sdxl-1.0
- text_to_image
---
|
# ***State-of-the-art ControlNet-openpose-sdxl-1.0 model. The results below, in Midjourney and anime styles, are for demonstration only.***
|
data:image/s3,"s3://crabby-images/0d795/0d795fd94d3064d65d1b02df7c933a194070ed3b" alt="images" |
|
data:image/s3,"s3://crabby-images/bcff9/bcff916c39969ed266c71390662a1d5d5dae119f" alt="images" |
|
|
|
|
|
### controlnet-openpose-sdxl-1.0 |
|
|
|
|
|
|
- **Developed by:** xinsir |
|
- **Model type:** ControlNet_SDXL |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** stabilityai/stable-diffusion-xl-base-1.0
|
|
|
### Model Sources

- **Paper:** https://arxiv.org/abs/2302.05543
|
|
|
### Examples |
|
data:image/s3,"s3://crabby-images/78d0a/78d0a2b916f6c6cb116b356a40365aae29d95e7b" alt="images10" |
|
data:image/s3,"s3://crabby-images/7ff94/7ff94a7d59ff6a650b959f22a2f051f76a809e9b" alt="images20" |
|
data:image/s3,"s3://crabby-images/5b4bf/5b4bf1119c25b40326e619084313e2cfd2fa016d" alt="images30" |
|
data:image/s3,"s3://crabby-images/e98d7/e98d764648dbcf26308b5e12164725a78e6b4f8d" alt="images40" |
|
data:image/s3,"s3://crabby-images/3cd32/3cd32f2056a73cc646886a998630ffbad3e1051c" alt="images50" |
|
data:image/s3,"s3://crabby-images/02aea/02aea286db39fab1e1af39778f2a6ddc75bb0b6d" alt="images60" |
|
data:image/s3,"s3://crabby-images/116b8/116b82bbf837a48335c196f8ebd7962aa522d5da" alt="images70" |
|
data:image/s3,"s3://crabby-images/8a910/8a9108d4cc4c7d83cf649393d045ed347ce6ddc9" alt="images80" |
|
data:image/s3,"s3://crabby-images/bde51/bde5141c29d9c5c9eed7f6627353e8fcc1fd0e22" alt="images90" |
|
data:image/s3,"s3://crabby-images/58cd4/58cd47b4db116f31decd3677098523df824487db" alt="images99" |
|
|
|
data:image/s3,"s3://crabby-images/7e03a/7e03aaf69070b53dd44e22a42ecede74bb52942c" alt="images0" |
|
data:image/s3,"s3://crabby-images/231dd/231dd1fd01e96602ec45ee274be02515760846ac" alt="images1" |
|
data:image/s3,"s3://crabby-images/3e607/3e607c45da8ad1459a8709f59ce845dd951505c7" alt="images2" |
|
data:image/s3,"s3://crabby-images/0f6bd/0f6bd277e98a13e4bd5f202e87af4ae2c7b555e2" alt="images3" |
|
data:image/s3,"s3://crabby-images/f68c2/f68c264b0eb4dd048fd7eea28ded962fc8cfa1ef" alt="images4" |
|
data:image/s3,"s3://crabby-images/2d4fd/2d4fdc22ebf8ffc5443ad5a32eaefbbd62d84b4a" alt="images5" |
|
data:image/s3,"s3://crabby-images/4cb16/4cb1636333592e3312ba07b86a5ca8e233e310ae" alt="images6" |
|
data:image/s3,"s3://crabby-images/47c1c/47c1c13230a164c6cfc885750a4db748da0cfc04" alt="images7" |
|
data:image/s3,"s3://crabby-images/9a899/9a8998cbe98bd054384c78640b48d10727f7bf46" alt="images8" |
|
data:image/s3,"s3://crabby-images/cf833/cf8331aca70e10c3c31fbbc746fee7940f6e6e5c" alt="images9" |
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
```python
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import EulerAncestralDiscreteScheduler
from controlnet_aux import OpenposeDetector
from PIL import Image
import torch
import numpy as np
import cv2

controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better; describe it in as much detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'

# Euler Ancestral scheduler, configured from the SDXL base model.
eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-openpose-sdxl-1.0",
    torch_dtype=torch.float16
)

# When testing with another base model, you also need to change the VAE.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)
# Assumption: a CUDA GPU is available; the float16 weights are intended to run on GPU.
pipe = pipe.to("cuda")

# Pose detector that extracts an OpenPose skeleton from the input image.
processor = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')

controlnet_img = cv2.imread("your image path")
controlnet_img = processor(controlnet_img, hand_and_face=False, output_type='cv2')

# For best results, resize the pose image to about 1024 * 1024 pixels
# (or to the same bucket resolution), keeping the aspect ratio.
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
new_width, new_height = int(width * ratio), int(height * ratio)
controlnet_img = cv2.resize(controlnet_img, (new_width, new_height))
controlnet_img = Image.fromarray(controlnet_img)

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images

# PNG usually preserves image quality better than JPG or WebP, but the files are much larger.
images[0].save("your image save path")
```
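The resize step in the snippet above can also be factored into a small helper. The sketch below is only an illustration: `fit_to_pixel_budget` is a hypothetical name, not part of diffusers, and rounding each side to a multiple of 8 is an added assumption (the SDXL VAE downsamples by a factor of 8, so such sizes are a safe choice) rather than something this model card requires.

```python
import math


def fit_to_pixel_budget(width, height, budget=1024 * 1024, multiple=8):
    """Scale (width, height) to roughly `budget` pixels, keeping the aspect ratio.

    Each side is rounded to the nearest multiple of `multiple` (assumption:
    sizes divisible by 8 are convenient for the SDXL VAE).
    """
    ratio = math.sqrt(budget / (width * height))
    new_width = max(multiple, round(width * ratio / multiple) * multiple)
    new_height = max(multiple, round(height * ratio / multiple) * multiple)
    return new_width, new_height


# Example: a 1920x1080 pose image scaled to about one megapixel.
new_width, new_height = fit_to_pixel_budget(1920, 1080)
print(new_width, new_height)
```

The returned pair can be passed both to `cv2.resize` and to the `width` and `height` arguments of the pipeline call.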
|
|
|
|
|
## Evaluation Data

[HumanArt](https://github.com/IDEA-Research/HumanArt): we selected 2000 images with ground-truth pose annotations, used them to generate images, and calculated mAP.
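This card does not say how mAP is computed. As an assumption, the sketch below shows the Object Keypoint Similarity (OKS) score that COCO-style keypoint mAP is built on: a detected pose counts as a match when its OKS against the ground-truth pose exceeds a threshold. The function name `oks` and the per-keypoint constants passed as `kappas` are illustrative and are not taken from the authors' evaluation code.

```python
import math


def oks(pred, gt, visible, area, kappas):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt : lists of (x, y) keypoints in the same order
    visible  : per-keypoint visibility flags; only flags > 0 are scored
    area     : area of the ground-truth person region (the scale term), > 0
    kappas   : per-keypoint falloff constants (illustrative values here)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, k in zip(pred, gt, visible, kappas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        count += 1
    return total / count if count else 0.0


# A pose compared with itself is a perfect match.
pose = [(10.0, 10.0), (20.0, 20.0)]
print(oks(pose, pose, [1, 1], area=100.0, kappas=[0.1, 0.1]))
```

mAP would then average precision over several OKS thresholds. Whether the figures reported in this card follow exactly that protocol is not stated here.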
|
|
|
|
|
|
|
## Quantitative Result |
|
| Metric | xinsir/controlnet-openpose-sdxl-1.0 | lllyasviel/control_v11p_sd15_openpose | thibaud/controlnet-openpose-sdxl-1.0 |
|-------|-------|-------|-------|
| mAP | **0.357** | 0.326 | 0.209 |
|
|
|
On this benchmark, our model achieves the highest mAP among the open-source OpenPose ControlNet models compared.