Ldhlwh committed · commit 73d7814 · verified · 1 parent: 8d3907e

Update README.md

Files changed (1): README.md (+36 -5)

README.md CHANGED
@@ -4,6 +4,7 @@ language:
 - en
 library_name: diffusers
 pipeline_tag: text-to-image
+base_model: stabilityai/stable-diffusion-xl-base-1.0
 ---
 
 # Target-Driven Distillation
@@ -15,12 +16,12 @@ pipeline_tag: text-to-image
 
 </div>
 
-## Introduction
-
 Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance
 
-<div align="center">
-<img src='teaser.jpg'>
+<div align="center">
+<img src="assets/teaser.jpg" alt="teaser" style="zoom:80%;" />
+
+Samples generated by TDD-distilled SDXL, with only 4–8 steps.
 </div>
 
 ## Update
@@ -67,4 +68,34 @@ image = pipe(
 ).images[0]
 
 image.save("tdd.png")
-```
+```
+
+## Introduction
+
+Target-Driven Distillation (TDD) features three key designs that differ from previous consistency distillation methods.
+1. **TDD adopts a delicate selection strategy of target timesteps, increasing training efficiency.** Specifically, it first chooses from a predefined set of equidistant denoising schedules (*e.g.* 4–8 steps), then adds a stochastic offset to accommodate non-deterministic sampling (*e.g.* $\gamma$-sampling).
+2. **TDD utilizes decoupled guidance during training, making it open to post-tuning of the guidance scale at inference time.** Specifically, it replaces a portion of the text conditions with unconditional (*i.e.* empty) prompts, in order to align with the standard training process using CFG.
+3. **TDD can optionally be equipped with non-equidistant sampling and $x_0$ clipping, enabling more flexible and accurate image sampling.**
+
+<div align="center">
+<img src="assets/tdd_overview.jpg" alt="overview"/>
+
+An overview of TDD. (a) The training process features target timestep selection and decoupled guidance. (b) The inference process can optionally adopt non-equidistant denoising schedules.
+</div>
+
+<div align="center">
+<img src="assets/compare.png" alt="comparison" style="zoom:80%;" />
+
+Samples generated by SDXL models distilled with the mainstream consistency distillation methods LCM, PCM, TCD, and our TDD, from the same seeds. Our method demonstrates advantages in both image complexity and clarity.
+</div>
+
+<div align="center">
+<img src="assets/other_1.jpg" alt="other"/>
+
+Samples generated by different TDD-distilled base models, and by SDXL with different LoRA adapters or ControlNets.
+</div>
+
+
+<div align="center">
+Video samples generated by AnimateLCM-distilled (top) and TDD-distilled (bottom) SVD-xt 1.1, also with 4–8 steps.
+</div>
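To make the first design in the new Introduction concrete, here is a minimal Python sketch of target timestep selection. The schedule set, the offset distribution, and every name in it are illustrative assumptions, not the repository's actual implementation:

```python
import random

NUM_TRAIN_TIMESTEPS = 1000  # assumed DDPM-style training horizon

def sample_target_timestep(current_t: int, max_offset: int = 50) -> int:
    """Pick the target timestep the student is distilled toward (sketch)."""
    # 1) Choose one of the predefined equidistant schedules (4-8 steps).
    num_steps = random.choice([4, 5, 6, 7, 8])
    schedule = [round(i * NUM_TRAIN_TIMESTEPS / num_steps) for i in range(num_steps)]

    # 2) The target is the nearest schedule stop below the current timestep.
    target_t = max((t for t in schedule if t < current_t), default=0)

    # 3) Add a stochastic offset so training also covers the perturbed
    #    targets encountered under non-deterministic (gamma) sampling.
    return min(target_t + random.randint(0, max_offset), max(current_t - 1, 0))
```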
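The decoupled guidance in the second design amounts to classifier-free-guidance-style condition dropping during training. A hedged sketch, with the drop probability and names assumed for illustration:

```python
import random

def decouple_prompts(prompts: list[str], drop_prob: float = 0.1) -> list[str]:
    """Replace a fraction of text conditions with the empty (unconditional)
    prompt, mirroring how CFG's unconditional branch is trained (sketch)."""
    return ["" if random.random() < drop_prob else p for p in prompts]

# e.g. roughly 10% of a batch becomes unconditional:
batch = decouple_prompts(["a photo of a cat", "a castle at dusk"])
```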
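The optional $x_0$ clipping in the third design can be read as a one-line clamp on the predicted clean sample; the [-1, 1] range below is the usual pixel/latent convention and an assumption here:

```python
import torch

def clip_x0(x0_pred: torch.Tensor, lo: float = -1.0, hi: float = 1.0) -> torch.Tensor:
    # Clamp the model's predicted clean sample to the valid data range
    # before it is re-noised for the next (possibly non-equidistant) step.
    return x0_pred.clamp(lo, hi)
```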
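The diff shows only the tail of the README's inference snippet (`image = pipe(...)` through `image.save("tdd.png")`), and the omitted lines are not reconstructed here. For orientation only, few-step inference with a distilled SDXL checkpoint in diffusers generally takes the following shape; the LoRA repository and weight filename are placeholders, and the card's actual snippet presumably also configures TDD's scheduler, which is not visible in the diff:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Placeholder identifiers, not the model card's actual snippet.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("your-org/your-tdd-lora", weight_name="tdd_lora.safetensors")
pipe.fuse_lora()

image = pipe(
    "a photo of a cat",
    num_inference_steps=4,   # TDD targets 4-8 steps
    guidance_scale=2.0,      # tunable thanks to decoupled guidance
).images[0]

image.save("tdd.png")
```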