For example, to estimate the memory requirements for the bigscience/T0_3B model on a single GPU:

$ python -c 'from transformers import AutoModel; \
from deepspeed.runtime.zero.stage3 import estimate_zero3_model_states_mem_needs_all_live; \
model = AutoModel.from_pretrained("bigscience/T0_3B"); \
estimate_zero3_model_states_mem_needs_all_live(model, num_gpus_per_node=1, num_nodes=1)'
[]
Estimated memory needed for params, optim states and gradients for a:
HW: Setup with 1 node, 1 GPU per node.
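As a rough cross-check on what such an estimate covers, the sketch below applies the common rule of thumb for mixed-precision Adam training: about 16 bytes of model state per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states), divided evenly across GPUs under ZeRO stage 3. This is not the formula DeepSpeed's estimator uses, so its numbers will likely differ. The helper names and the nominal 3-billion parameter count are assumptions for illustration; the count is taken from the model's name, not measured.

```python
# Back-of-envelope sketch, NOT DeepSpeed's estimator formula.
# Rule of thumb for mixed-precision Adam: 2 bytes (fp16 weights)
# + 2 bytes (fp16 gradients) + 12 bytes (fp32 master weights,
# momentum, and variance) = 16 bytes of model state per parameter.
BYTES_PER_PARAM = 2 + 2 + 12


def rough_model_states_bytes_per_gpu(total_params, num_gpus_per_node=1, num_nodes=1):
    """Hypothetical helper: model-state bytes per GPU when states are
    partitioned evenly across all GPUs, ignoring activations and buffers."""
    total_gpus = num_gpus_per_node * num_nodes
    return BYTES_PER_PARAM * total_params / total_gpus


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Assumption: nominal parameter count suggested by the "3B" in the model name.
    nominal_params = 3_000_000_000
    per_gpu = rough_model_states_bytes_per_gpu(nominal_params, num_gpus_per_node=1, num_nodes=1)
    print(f"Rough model states per GPU: {to_gib(per_gpu):.1f} GiB")
```

The figure ignores activations, temporary buffers, and CPU offload, so treat it as an order-of-magnitude check and rely on the estimator's output for planning.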