# Distributed inference with multiple GPUs

On distributed setups, you can run inference across multiple GPUs with 🤗 [Accelerate](https://huggingface.co/docs/accelerate/index) or [PyTorch Distributed](https://pytorch.org/tutorials/beginner/dist_overview.html), which is useful for generating multiple prompts in parallel.

This guide will show you how to use 🤗 Accelerate and PyTorch Distributed for distributed inference.
## 🤗 Accelerate
🤗 [Accelerate](https://huggingface.co/docs/accelerate/index) is a library designed to make it easy to train or run inference across distributed setups. It simplifies the process of setting up the distributed environment so you can focus on your PyTorch code.

To begin, create a Python file and initialize [`accelerate.PartialState`] to create a distributed environment; your setup is automatically detected, so you don't need to explicitly define the `rank` or `world_size`. Move the [`DiffusionPipeline`] to `distributed_state.device` to assign a GPU to each process.

Now use the [`~accelerate.PartialState.split_between_processes`] utility as a context manager to automatically distribute the prompts between the processes.
```py
import torch
from accelerate import PartialState
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
distributed_state = PartialState()
pipeline.to(distributed_state.device)

with distributed_state.split_between_processes(["a dog", "a cat"]) as prompt:
    result = pipeline(prompt).images[0]
    result.save(f"result_{distributed_state.process_index}.png")
```
Use the `--num_processes` argument to specify the number of GPUs to use, and call `accelerate launch` to run the script:
```bash
accelerate launch run_distributed.py --num_processes=2
```
<Tip>

Refer to the [Distributed inference with 🤗 Accelerate](https://huggingface.co/docs/accelerate/en/usage_guides/distributed_inference#distributed-inference-with-accelerate) guide for more details.

</Tip>
## PyTorch Distributed
PyTorch supports [`DistributedDataParallel`](https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html), which enables data parallelism.

To begin, create a Python file and import `torch.distributed` and `torch.multiprocessing` to set up the distributed process group and spawn a process for inference on each GPU. You should also initialize a [`DiffusionPipeline`]; moving the pipeline to `rank` and using `get_rank` assigns a GPU to each process, and each process then handles a different prompt:
```py
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from diffusers import DiffusionPipeline
sd = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
```
You'll want to create a function to run inference; [`init_process_group`] handles creating a distributed environment with the type of backend to use, the `rank` of the current process, and the `world_size`, or the number of participating processes. If you're running inference in parallel over 2 GPUs, the `world_size` is 2.
```py
def run_inference(rank, world_size):
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    sd.to(rank)

    if torch.distributed.get_rank() == 0:
        prompt = "a dog"
    elif torch.distributed.get_rank() == 1:
        prompt = "a cat"

    image = sd(prompt).images[0]
    # Join the words with underscores so "a dog" becomes "a_dog.png"
    image.save(f"./{'_'.join(prompt.split())}.png")
```
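One detail worth noting: when the script is launched as a plain Python program rather than through `torchrun`, `init_process_group` needs a rendezvous address to coordinate the processes, typically provided via the `MASTER_ADDR` and `MASTER_PORT` environment variables. A minimal sketch of setting them before the process group is created (the address and port values here are illustrative defaults, assuming a single-node setup with a free port):

```python
import os

def setup_rendezvous(addr="localhost", port="29500"):
    # Plain `python script.py` / mp.spawn launches do not set these for you;
    # torchrun would. The values are illustrative single-node defaults.
    os.environ.setdefault("MASTER_ADDR", addr)
    os.environ.setdefault("MASTER_PORT", port)

setup_rendezvous()
```

Calling `setup_rendezvous()` once at the top of the script, before any process is spawned, is enough for `init_process_group("nccl", ...)` to find its peers.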
To run the distributed inference, call [`mp.spawn`](https://pytorch.org/docs/stable/multiprocessing.html#torch.multiprocessing.spawn) to run the `run_inference` function on the number of GPUs defined in `world_size`:
```py
def main():
    world_size = 2
    mp.spawn(run_inference, args=(world_size,), nprocs=world_size, join=True)


if __name__ == "__main__":
    main()
```
Once you have completed the inference script, use the `--nproc_per_node` argument to specify the number of GPUs to use and call `torchrun` to run the script:
```bash
torchrun run_distributed.py --nproc_per_node=2
```
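Note that `torchrun` itself spawns one worker per GPU and exports `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` into each worker's environment, so a script launched this way would read those variables instead of spawning processes with `mp.spawn`. A hedged sketch of reading that environment (the defaults assume single-process execution when launched without `torchrun`):

```python
import os

def read_torchrun_env(environ):
    """Read the per-worker variables torchrun exports.

    Falls back to single-process defaults (rank 0, world size 1)
    when the script is launched without a distributed launcher.
    """
    rank = int(environ.get("RANK", 0))
    world_size = int(environ.get("WORLD_SIZE", 1))
    local_rank = int(environ.get("LOCAL_RANK", 0))
    return rank, world_size, local_rank

# Simulated environment for worker 1 of 2 on a single node:
print(read_torchrun_env({"RANK": "1", "WORLD_SIZE": "2", "LOCAL_RANK": "1"}))
# → (1, 2, 1)
```

With `torchrun`, the returned `rank` and `world_size` would be passed to `init_process_group` directly, and `local_rank` would select the GPU device.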