---
license: apache-2.0
tags:
- openpose
- controlnet
- diffusers
- controlnet-openpose-sdxl-1.0
- text_to_image
---

# ***State-of-the-art ControlNet-openpose-sdxl-1.0 model. The results below, in Midjourney and anime styles, are for demonstration only.***



### controlnet-openpose-sdxl-1.0

- **Developed by:** xinsir
- **Model type:** ControlNet_SDXL
- **License:** apache-2.0
- **Finetuned from model:** stabilityai/stable-diffusion-xl-base-1.0

### Model Sources

- **Paper:** https://arxiv.org/abs/2302.05543

### Examples












## How to Get Started with the Model

Use the code below to get started with the model.

```python
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import EulerAncestralDiscreteScheduler
from controlnet_aux import OpenposeDetector
from PIL import Image
import torch
import numpy as np
import cv2


controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better; describe the image in as much detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'


eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-openpose-sdxl-1.0",
    torch_dtype=torch.float16
)

# When testing with another base model, change the VAE to match it.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)
# float16 weights need a GPU; this assumes a CUDA device is available.
pipe.to("cuda")

processor = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')

# cv2.imread returns BGR; convert to RGB before running pose detection.
controlnet_img = cv2.imread("your image path")
controlnet_img = cv2.cvtColor(controlnet_img, cv2.COLOR_BGR2RGB)
controlnet_img = processor(controlnet_img, hand_and_face=False, output_type='cv2')

# Resize to about 1024 * 1024 pixels (or the matching bucket resolution) for best performance.
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
# Round each side to a multiple of 8 so it matches the SDXL latent grid.
new_width = int(round(width * ratio / 8)) * 8
new_height = int(round(height * ratio / 8)) * 8
controlnet_img = cv2.resize(controlnet_img, (new_width, new_height))
controlnet_img = Image.fromarray(controlnet_img)

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images

# PNG usually preserves quality better than JPG or WebP, but the file is much larger.
images[0].save("your image save path")
```
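
The resize step in the snippet above keeps the aspect ratio while scaling the pose map to roughly 1024 * 1024 pixels. A standalone sketch of that computation is shown below. The helper name `fit_to_pixel_budget` and its `multiple` parameter are illustrative assumptions, not part of the original code; rounding to a multiple of 8 is chosen here because the SDXL VAE downsamples by a factor of 8.

```python
import math


def fit_to_pixel_budget(width, height, target_area=1024 * 1024, multiple=8):
    """Scale (width, height) to about target_area pixels, keeping aspect ratio.

    Each side is rounded to the nearest multiple of `multiple` (hypothetical
    parameter, defaulting to 8 to match the VAE downsampling factor).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = math.sqrt(target_area / (width * height))
    new_width = max(multiple, round(width * ratio / multiple) * multiple)
    new_height = max(multiple, round(height * ratio / multiple) * multiple)
    return new_width, new_height


# A 1920 x 1080 input maps to a landscape size with about the same pixel count as 1024 x 1024.
print(fit_to_pixel_budget(1920, 1080))
```

The returned pair can be passed as `width` and `height` to the pipeline call and used as the target size for `cv2.resize`.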

## Evaluation Data

[HumanArt](https://github.com/IDEA-Research/HumanArt): we selected 2000 images with ground-truth pose annotations, used them to generate images, and calculated mAP.
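
The card does not state how mAP is computed. Assuming the common COCO-style keypoint protocol, in which a detected pose counts as correct when its object keypoint similarity (OKS) with the ground truth exceeds a threshold, the per-pose score could be sketched as follows. The function name, its arguments, and the sigma values used in the example are illustrative and are not taken from the author's evaluation code.

```python
import math


def object_keypoint_similarity(pred, gt, visibility, area, sigmas):
    """COCO-style OKS between one predicted and one ground-truth pose.

    pred, gt:    lists of (x, y) keypoints in pixels, same order and length
    visibility:  flags, > 0 where the ground-truth keypoint is labeled
    area:        ground-truth object area in pixels^2 (must be > 0)
    sigmas:      per-keypoint falloff constants (dataset-specific)
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2))
        labeled += 1
    return total / labeled if labeled else 0.0


# One keypoint, 5 px off, on an object of area 100 px^2 (hypothetical values).
print(object_keypoint_similarity([(3, 4)], [(0, 0)], [1], 100.0, [0.25]))
```

Under this assumption, mAP would then average precision over a range of OKS thresholds, with poses detected on the generated images compared against the annotated poses used as conditioning.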

## Quantitative Results

| metric | xinsir/controlnet-openpose-sdxl-1.0 | lllyasviel/control_v11p_sd15_openpose | thibaud/controlnet-openpose-sdxl-1.0 |
|-------|-------|-------|-------|
| mAP | **0.357** | 0.326 | 0.209 |

Among the open-source OpenPose ControlNet models compared here, this model achieves the highest mAP.
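
To put the table in relative terms, the short calculation below compares the mAP values listed above. It only rearranges numbers from the table; the helper name `relative_gain` is illustrative.

```python
# mAP values copied from the table above
map_scores = {
    "xinsir/controlnet-openpose-sdxl-1.0": 0.357,
    "lllyasviel/control_v11p_sd15_openpose": 0.326,
    "thibaud/controlnet-openpose-sdxl-1.0": 0.209,
}


def relative_gain(ours, baseline):
    """Relative improvement of `ours` over `baseline`, as a fraction."""
    return (ours - baseline) / baseline


ours_name = "xinsir/controlnet-openpose-sdxl-1.0"
for name, score in map_scores.items():
    if name != ours_name:
        print(f"{name}: +{relative_gain(map_scores[ours_name], score):.1%}")
```

This prints a relative gain of about 9.5% over `lllyasviel/control_v11p_sd15_openpose` and about 70.8% over `thibaud/controlnet-openpose-sdxl-1.0`.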