ShermanG committed on
Commit f38bc5d · verified · 1 Parent(s): 448631e

Update README.md

Files changed (1)
  1. README.md +52 -2
README.md CHANGED
@@ -1,6 +1,56 @@
  ---
  library_name: diffusers
  ---
- # ControlNet Standard Lineart for Diffuser XL
-
- 1111
+ # ControlNet Standard Lineart for SDXL
+ SDXL has excellent content generation and impressive LoRA performance, but ControlNet has always been its weak point, putting it out of reach for many users. Given the computational constraints of a personal GPU, one cannot easily train and tune a high-quality ControlNet model.
+
+
+ **This model attempts to fill that gap in ControlNet support for SDXL, lowering the barrier for personal users.**
+
+ ## Environment Setup and Usage
+
+ Training uses the official [script](https://github.com/huggingface/diffusers/blob/main/examples/controlnet/train_controlnet_sdxl.py) from the Diffusers library.
+
+ Environment setup is covered in the [official Diffusers guide](https://github.com/huggingface/diffusers/tree/main).
+
+ Usage example:
+ ```python
+ import torch
+ from PIL import Image
+ from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel, AutoencoderKL
+
+ # Load this lineart ControlNet and the fp16-safe SDXL VAE
+ controlnet_conditioning_scale = 0.9
+ controlnet = ControlNetModel.from_pretrained(
+     "path/to/this/directory", torch_dtype=torch.float16
+ )
+ vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
+
+ pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, vae=vae, torch_dtype=torch.float16
+ )
+ pipe.enable_model_cpu_offload()  # reduce VRAM usage by offloading idle submodules
+
+ prompt = "Your prompt"
+ negative_prompt = "Your negative prompt"
+ line = Image.open("path/to/your/conditioning/image")  # a lineart control image
+
+ image = pipe(
+     prompt,
+     negative_prompt=negative_prompt,
+     controlnet_conditioning_scale=controlnet_conditioning_scale,
+     image=line,
+ ).images[0]
+ ```
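+
+ The control image passed as `image=line` should already be a lineart map. Below is a minimal sketch of producing one with ***LineartStandardDetector*** from ***controlnet_aux***, the same preprocessor used during training (see Training Setup); the file paths are placeholders.
+ ```python
+ from PIL import Image
+ from controlnet_aux import LineartStandardDetector
+
+ # The standard lineart detector is purely algorithmic; no weights are downloaded
+ detector = LineartStandardDetector()
+
+ source = Image.open("path/to/your/source/image")  # placeholder path
+ line = detector(source)  # returns a PIL image containing the extracted lineart
+ line.save("path/to/your/conditioning/image")
+ ```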
+
+ ## Training Setup
+
+ - **Base Model**: stabilityai/stable-diffusion-xl-base-1.0
+ - **Dataset**: [cc12m](https://github.com/rom1504/img2dataset), filtered to images at 1024px resolution and above, for over 300k image pairs. Images were cropped, or resized with [image restoration](https://github.com/xinntao/Real-ESRGAN), to 1024x1024 squares before being fed to the training script (a minimal cropping sketch follows this list).
+ - **Lineart**: Used ***LineartStandardDetector*** from ***controlnet_aux*** to extract the conditioning images.
+ - **Total Batch Size**: 16 (4 gradient accumulation steps × 4 GPUs in parallel)
+ - **Steps**: 50k
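+
+ As referenced in the Dataset item above, a minimal sketch of the square-cropping step, using plain PIL. A center crop is assumed here; the Real-ESRGAN upscaling applied to smaller images is not shown.
+ ```python
+ from PIL import Image
+
+ def to_1024_square(img: Image.Image) -> Image.Image:
+     """Center-crop to a square, then resize to 1024x1024."""
+     side = min(img.size)
+     left = (img.width - side) // 2
+     top = (img.height - side) // 2
+     square = img.crop((left, top, left + side, top + side))
+     return square.resize((1024, 1024), Image.LANCZOS)
+
+ img = to_1024_square(Image.open("path/to/a/raw/image"))  # placeholder path
+ ```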
+
+ ## Result
56