# Control image brightness
The Stable Diffusion pipeline is mediocre at generating images that are either very bright or very dark, as explained in the [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://huggingface.co/papers/2305.08891) paper. The solutions proposed in the paper are currently implemented in the [`DDIMScheduler`], which you can use to improve the brightness of your images.
<Tip>
💡 Take a look at the paper linked above for more details about the proposed solutions!
</Tip>
One of the solutions is to train a model with *v prediction* and *v loss*. Add the following flag to the [`train_text_to_image.py`](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py) or [`train_text_to_image_lora.py`](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_lora.py) script to enable `v_prediction`:
```bash
--prediction_type="v_prediction"
```
For example, let's use the [`ptx0/pseudo-journey-v2`](https://huggingface.co/ptx0/pseudo-journey-v2) checkpoint, which has been finetuned with `v_prediction`.
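For context, v prediction trains the network to predict a velocity-like target instead of the noise itself. The following is a minimal scalar sketch of that target, assuming the usual variance-preserving parameterization in which `alpha_bar` is the cumulative signal level at a timestep. The function names are illustrative and are not part of the diffusers API.

```python
import math


def noisy_sample(x0, noise, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * noise


def v_target(x0, noise, alpha_bar):
    """v-prediction target: v = sqrt(alpha_bar) * noise - sqrt(1 - alpha_bar) * x0."""
    return math.sqrt(alpha_bar) * noise - math.sqrt(1.0 - alpha_bar) * x0


def x0_from_v(x_t, v, alpha_bar):
    """Recover the clean sample from x_t and v (inverse of the two functions above)."""
    return math.sqrt(alpha_bar) * x_t - math.sqrt(1.0 - alpha_bar) * v


# Illustrative values only.
x0, noise = 0.5, -1.2
x_t = noisy_sample(x0, noise, alpha_bar=0.0)  # pure noise at zero SNR
v = v_target(x0, noise, alpha_bar=0.0)        # the target still depends on x0
print(x_t, v, x0_from_v(x_t, v, alpha_bar=0.0))
```

At `alpha_bar = 0` the noisy sample is pure noise, so predicting the noise would be trivial, whereas the v target still carries information about the clean sample. This is why v prediction pairs naturally with a zero terminal SNR schedule.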
Next, configure the following parameters in the [`DDIMScheduler`]:
1. `rescale_betas_zero_snr=True`, rescales the noise schedule to zero terminal signal-to-noise ratio (SNR)
2. `timestep_spacing="trailing"`, starts sampling from the last timestep
```py
>>> from diffusers import DiffusionPipeline, DDIMScheduler
>>> pipeline = DiffusionPipeline.from_pretrained("ptx0/pseudo-journey-v2")
>>> # switch the scheduler in the pipeline to use the DDIMScheduler
>>> pipeline.scheduler = DDIMScheduler.from_config(
... pipeline.scheduler.config, rescale_betas_zero_snr=True, timestep_spacing="trailing"
... )
>>> pipeline.to("cuda")
```
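To make the two scheduler settings above more concrete, here is a rough pure-Python sketch of what they do, following the description in the linked paper. It is not the diffusers implementation: the helper names are hypothetical, and the schedule values (`beta_start=0.00085`, `beta_end=0.012`, linear in `sqrt(beta)`) are assumed Stable Diffusion style defaults used only for illustration.

```python
import math


def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Assumed Stable Diffusion style schedule: linear in sqrt(beta)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def cumulative_sqrt_alphas(betas):
    """sqrt of the running product of (1 - beta), i.e. sqrt(alpha_bar_t)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(math.sqrt(prod))
    return out


def rescale_zero_terminal_snr(betas):
    """Shift and scale sqrt(alpha_bar) so the last value is 0 and the first is kept."""
    s = cumulative_sqrt_alphas(betas)
    first, last = s[0], s[-1]
    s = [(value - last) * first / (first - last) for value in s]
    alphas_bar = [value * value for value in s]
    # Convert the cumulative products back to per-step alphas, then to betas.
    alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - alpha for alpha in alphas]


def leading_timesteps(num_train_timesteps, num_inference_steps, steps_offset=0):
    """'leading' spacing: multiples of the integer step ratio, never reaching the last timestep."""
    ratio = num_train_timesteps // num_inference_steps
    return [i * ratio + steps_offset for i in reversed(range(num_inference_steps))]


def trailing_timesteps(num_train_timesteps, num_inference_steps):
    """'trailing' spacing: count down from the last training timestep."""
    ratio = num_train_timesteps / num_inference_steps
    return [round(num_train_timesteps - i * ratio) - 1 for i in range(num_inference_steps)]


betas = scaled_linear_betas()
rescaled = rescale_zero_terminal_snr(betas)
print(cumulative_sqrt_alphas(betas)[-1], cumulative_sqrt_alphas(rescaled)[-1])
print(leading_timesteps(1000, 10)[0], trailing_timesteps(1000, 10)[0])
```

In this sketch the original schedule ends with a small but nonzero signal level, while the rescaled one ends at exactly zero. With 1000 training timesteps and 10 inference steps, trailing spacing starts at timestep 999, whereas leading spacing (with an offset of 0) starts at 900, so only the trailing variant actually samples from the pure-noise timestep.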
Finally, in your call to the pipeline, set `guidance_rescale` to prevent overexposure:
```py
prompt = "A lion in galaxies, spirals, nebulae, stars, smoke, iridescent, intricate detail, octane render, 8k"
image = pipeline(prompt, guidance_rescale=0.7).images[0]
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/zero_snr.png"/>
</div>
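The idea behind `guidance_rescale` can be sketched as follows, based on the description in the linked paper: classifier-free guidance inflates the standard deviation of the noise prediction, so the guided prediction is rescaled toward the standard deviation of the text-conditioned prediction and then blended with the unrescaled result. This is a simplified illustration on flat lists of floats with hypothetical function names, not the pipeline's actual code.

```python
import statistics


def classifier_free_guidance(uncond, text, guidance_scale):
    """Standard guidance: uncond + scale * (text - uncond), element-wise."""
    return [u + guidance_scale * (t - u) for u, t in zip(uncond, text)]


def rescale_guidance(noise_cfg, text, guidance_rescale=0.0):
    """Pull the std of the guided prediction back toward the std of the text prediction."""
    std_text = statistics.pstdev(text)
    std_cfg = statistics.pstdev(noise_cfg)
    rescaled = [x * std_text / std_cfg for x in noise_cfg]
    # Blend the rescaled and original predictions; 0.0 disables the correction.
    return [
        guidance_rescale * r + (1.0 - guidance_rescale) * x
        for r, x in zip(rescaled, noise_cfg)
    ]


# Made-up predictions purely for illustration.
uncond = [0.1, -0.2, 0.3, 0.0]
text = [0.5, -0.4, 0.2, 0.1]
guided = classifier_free_guidance(uncond, text, guidance_scale=7.5)
corrected = rescale_guidance(guided, text, guidance_rescale=0.7)
print(statistics.pstdev(text), statistics.pstdev(guided), statistics.pstdev(corrected))
```

With `guidance_rescale=0.7`, as in the call above, the corrected prediction has a standard deviation between that of the text-conditioned prediction and that of the fully guided one, which is what reduces overexposure.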