For example, to estimate the memory requirements for the bigscience/T0_3B model on a single GPU:

$ python -c 'from transformers import AutoModel; \
from deepspeed.runtime.zero.stage3 import estimate_zero3_model_states_mem_needs_all_live; \
model = AutoModel.from_pretrained("bigscience/T0_3B"); \
estimate_zero3_model_states_mem_needs_all_live(model, num_gpus_per_node=1, num_nodes=1)'
[]
Estimated memory needed for params, optim states and gradients for a:
HW: Setup with 1 node, 1 GPU per node.