<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Schedulers
Diffusion pipelines are composed of components such as diffusion models and schedulers. You can customize a pipeline by swapping some of its components for others, and the most typical example of this kind of customization is replacing the [scheduler](../api/schedulers/overview.md).
A scheduler defines the overall denoising process of a diffusion system by answering the following questions:
- How many denoising steps should be taken?
- Should the process be stochastic or deterministic?
- Which algorithm should be used to find the denoised sample?
This process can be fairly intricate, and it often comes down to a trade-off between denoising speed and denoising quality. Judging quantitatively which scheduler suits a given pipeline best is very difficult. For that reason, it is often recommended to simply try a scheduler, look at the generated images, and judge its performance qualitatively.
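To make the first question concrete, here is a minimal sketch of one common way a scheduler can choose which of its training timesteps to visit at inference time. It assumes evenly spaced steps with an integer stride. The function name `spaced_timesteps` and its parameters are illustrative and not part of the diffusers API, and individual schedulers may space their steps differently.

```python
def spaced_timesteps(num_train_timesteps: int, num_inference_steps: int, steps_offset: int = 0) -> list[int]:
    """Pick evenly spaced timesteps, ordered from most to least noisy."""
    if not 0 < num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in [1, num_train_timesteps]")
    # Integer stride between two consecutive inference timesteps.
    stride = num_train_timesteps // num_inference_steps
    ascending = [i * stride + steps_offset for i in range(num_inference_steps)]
    # Denoising starts at the noisiest timestep, so reverse the order.
    return ascending[::-1]


timesteps = spaced_timesteps(num_train_timesteps=1000, num_inference_steps=50, steps_offset=1)
print(len(timesteps), timesteps[:3], timesteps[-1])  # 50 [981, 961, 941] 1
```

Fewer inference steps mean a larger stride and a faster but coarser denoising trajectory, which is the speed and quality trade-off described above.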
## Load the pipeline
Let's start by loading the Stable Diffusion pipeline. Remember that to use Stable Diffusion you need to be a registered user on the Hugging Face Hub and to have accepted the relevant [license](https://huggingface.co/runwayml/stable-diffusion-v1-5).
*Translator's note: newly created Hugging Face accounts currently do not appear to be asked to accept the license.*
```python
from huggingface_hub import login
from diffusers import DiffusionPipeline
import torch
# first we need to login with our access token
login()
# Now we can download the pipeline
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
```
Next, move the pipeline to the GPU.
```python
pipeline.to("cuda")
```
## Access the scheduler
The scheduler is always one of the pipeline's components, and it is usually exposed on the pipeline instance as a property named `scheduler`.
```python
pipeline.scheduler
```
**Output**:
```
PNDMScheduler {
"_class_name": "PNDMScheduler",
"_diffusers_version": "0.8.0.dev0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"num_train_timesteps": 1000,
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"trained_betas": null
}
```
From this output we can see that the scheduler is an instance of [`PNDMScheduler`]. Let's now compare [`PNDMScheduler`] with other schedulers. First, define the prompt that will be used for testing:
```python
prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
```
Next, fix the random seed so that the pipeline generates similar images from run to run:
```python
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator).images[0]
image
```
<p align="center">
<br>
<img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_pndm.png" width="400"/>
<br>
</p>
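The generator is re-created with the same seed before each pipeline call in this guide because a generator's state advances every time it is used. The sketch below illustrates that behaviour with Python's standard `random.Random` as a stand-in for `torch.Generator`. The helper `draw_noise` is hypothetical and only mimics sampling initial noise.

```python
import random


def draw_noise(generator: random.Random, size: int = 4) -> list[float]:
    """Stand-in for sampling the initial latent noise from a generator."""
    return [generator.gauss(0.0, 1.0) for _ in range(size)]


generator = random.Random(8)
first = draw_noise(generator)
second = draw_noise(generator)        # same generator reused: its state has advanced
again = draw_noise(random.Random(8))  # fresh generator with the same seed

print(first == second)  # False
print(first == again)   # True
```

Reusing one generator across several runs would therefore start each run from different noise, which makes a scheduler comparison harder to read.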
## Changing the scheduler
Next, let's see how to swap the pipeline's scheduler for a different one. Every scheduler has a [`SchedulerMixin.compatibles`] property, which lists the schedulers that are **compatible** with it.
```python
pipeline.scheduler.compatibles
```
**Output**:
```
[diffusers.schedulers.scheduling_lms_discrete.LMSDiscreteScheduler,
diffusers.schedulers.scheduling_ddim.DDIMScheduler,
diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler,
diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler,
diffusers.schedulers.scheduling_pndm.PNDMScheduler,
diffusers.schedulers.scheduling_ddpm.DDPMScheduler,
diffusers.schedulers.scheduling_euler_ancestral_discrete.EulerAncestralDiscreteScheduler]
```
The compatible schedulers are:
- [`LMSDiscreteScheduler`],
- [`DDIMScheduler`],
- [`DPMSolverMultistepScheduler`],
- [`EulerDiscreteScheduler`],
- [`PNDMScheduler`],
- [`DDPMScheduler`],
- [`EulerAncestralDiscreteScheduler`].
Let's compare each of these schedulers using the prompt defined earlier.
To change the scheduler inside the pipeline, we will use the [`ConfigMixin.config`] property together with the [`ConfigMixin.from_config`] method.
```python
pipeline.scheduler.config
```
**Output**:
```
FrozenDict([('num_train_timesteps', 1000),
('beta_start', 0.00085),
('beta_end', 0.012),
('beta_schedule', 'scaled_linear'),
('trained_betas', None),
('skip_prk_steps', True),
('set_alpha_to_one', False),
('steps_offset', 1),
('_class_name', 'PNDMScheduler'),
('_diffusers_version', '0.8.0.dev0'),
('clip_sample', False)])
```
The config of the existing scheduler can also be carried over to another compatible scheduler.
The following example replaces the existing scheduler (`pipeline.scheduler`) with a different kind of scheduler (`DDIMScheduler`). Note that the config of the existing scheduler is passed as the argument to the `.from_config` method.
```python
from diffusers import DDIMScheduler
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
```
이제 νŒŒμ΄ν”„λΌμΈμ„ μ‹€ν–‰ν•΄μ„œ 두 μŠ€μΌ€μ€„λŸ¬ μ‚¬μ΄μ˜ μƒμ„±λœ μ΄λ―Έμ§€μ˜ 퀄리티λ₯Ό λΉ„κ΅ν•΄λ΄…μ‹œλ‹€.
```python
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator).images[0]
image
```
<p align="center">
<br>
<img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_ddim.png" width="400"/>
<br>
</p>
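As a rough mental model of the config hand-off in this section, the toy classes below pass a shared config to a different class that keeps only the arguments it understands. This is a simplified sketch and not the diffusers implementation: `ToyScheduler`, `ToyPNDM`, and `ToyDDIM` are made-up names, and the real `from_config` does more than this.

```python
import inspect


class ToyScheduler:
    """Hypothetical base class for illustration; not part of diffusers."""

    @classmethod
    def from_config(cls, config: dict):
        # Keep only the keys that this class's __init__ accepts.
        accepted = set(inspect.signature(cls.__init__).parameters) - {"self"}
        kwargs = {key: value for key, value in config.items() if key in accepted}
        return cls(**kwargs)

    @property
    def config(self) -> dict:
        # Expose the constructor arguments plus the class name.
        return {**self._init_kwargs, "_class_name": type(self).__name__}


class ToyPNDM(ToyScheduler):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02, skip_prk_steps=False):
        self._init_kwargs = dict(
            num_train_timesteps=num_train_timesteps,
            beta_start=beta_start,
            beta_end=beta_end,
            skip_prk_steps=skip_prk_steps,
        )


class ToyDDIM(ToyScheduler):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02, clip_sample=True):
        self._init_kwargs = dict(
            num_train_timesteps=num_train_timesteps,
            beta_start=beta_start,
            beta_end=beta_end,
            clip_sample=clip_sample,
        )


old = ToyPNDM(beta_start=0.00085, beta_end=0.012, skip_prk_steps=True)
new = ToyDDIM.from_config(old.config)
print(new.config)
```

In this toy version the shared noise-schedule values carry over, the `ToyPNDM`-only `skip_prk_steps` entry is dropped, and `ToyDDIM` falls back to its own default for `clip_sample`.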
## Compare schedulers
So far we have run the [`PNDMScheduler`] and the [`DDIMScheduler`]. There are several more schedulers left to compare, so let's keep going.
[`LMSDiscreteScheduler`] usually gives better results.
```python
from diffusers import LMSDiscreteScheduler
pipeline.scheduler = LMSDiscreteScheduler.from_config(pipeline.scheduler.config)
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator).images[0]
image
```
<p align="center">
<br>
<img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_lms.png" width="400"/>
<br>
</p>
[`EulerDiscreteScheduler`] and [`EulerAncestralDiscreteScheduler`] can generate high-quality images in as few as 30 inference steps.
```python
from diffusers import EulerDiscreteScheduler
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator, num_inference_steps=30).images[0]
image
```
<p align="center">
<br>
<img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_euler_discrete.png" width="400"/>
<br>
</p>
```python
from diffusers import EulerAncestralDiscreteScheduler
pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator, num_inference_steps=30).images[0]
image
```
<p align="center">
<br>
<img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_euler_ancestral.png" width="400"/>
<br>
</p>
At the time of writing, [`DPMSolverMultistepScheduler`] seems to give the best trade-off between speed and image quality. It can be run with as few as 20 steps.
```python
from diffusers import DPMSolverMultistepScheduler
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator, num_inference_steps=20).images[0]
image
```
<p align="center">
<br>
<img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/diffusers_docs/astronaut_dpm.png" width="400"/>
<br>
</p>
As you can see, the generated images are very similar and appear to be of comparable quality. In practice, the choice of scheduler often depends on the specific use case. A good approach is to run several schedulers yourself and compare the results by eye.
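The side-by-side comparison in this section can be wrapped in a small loop. The helper below is a sketch under a few assumptions: `compare_schedulers` and `make_generator` are hypothetical names, the pipeline is expected to behave like the `pipeline` object used above (a `scheduler` attribute whose `compatibles` lists usable classes, and a call that returns an object with `.images`), and the timing is wall-clock only.

```python
import time


def compare_schedulers(pipeline, scheduler_classes, prompt, make_generator, **call_kwargs):
    """Run the same prompt with several schedulers and collect image and timing."""
    base_config = pipeline.scheduler.config       # config of the original scheduler
    compatibles = pipeline.scheduler.compatibles  # classes that may replace it
    results = {}
    for scheduler_cls in scheduler_classes:
        if scheduler_cls not in compatibles:
            continue  # skip schedulers this pipeline cannot use
        pipeline.scheduler = scheduler_cls.from_config(base_config)
        start = time.perf_counter()
        # A fresh generator per run keeps the starting noise identical.
        image = pipeline(prompt, generator=make_generator(), **call_kwargs).images[0]
        results[scheduler_cls.__name__] = (image, time.perf_counter() - start)
    return results
```

With the objects from this guide, a call might look like `compare_schedulers(pipeline, [DDIMScheduler, LMSDiscreteScheduler], prompt, make_generator=lambda: torch.Generator(device="cuda").manual_seed(8), num_inference_steps=30)`. The returned dictionary maps each scheduler's class name to its image and elapsed seconds, so the images can be inspected side by side.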
## Changing the scheduler in Flax
If you are a JAX/Flax user, you can also change the default pipeline scheduler. The following example shows how to run inference with the Flax Stable Diffusion pipeline and the very fast [DPM-Solver++ scheduler](../api/schedulers/multistep_dpm_solver).
```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline, FlaxDPMSolverMultistepScheduler
model_id = "runwayml/stable-diffusion-v1-5"
scheduler, scheduler_state = FlaxDPMSolverMultistepScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
)
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    model_id,
    scheduler=scheduler,
    revision="bf16",
    dtype=jax.numpy.bfloat16,
)
params["scheduler"] = scheduler_state
# Generate 1 image per parallel device (8 on TPUv2-8 or TPUv3-8)
prompt = "a photo of an astronaut riding a horse on mars"
num_samples = jax.device_count()
prompt_ids = pipeline.prepare_inputs([prompt] * num_samples)
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 25
# shard inputs and rng
params = replicate(params)
prng_seed = jax.random.split(prng_seed, jax.device_count())
prompt_ids = shard(prompt_ids)
images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```
<Tip warning={true}>
The following Flax schedulers are *not yet* compatible with the Flax Stable Diffusion pipeline:
- `FlaxLMSDiscreteScheduler`
- `FlaxDDPMScheduler`
</Tip>