T
Tensor Parallelism (TP)
Parallelism technique for training on multiple GPUs in which each tensor is split into multiple chunks, so instead of the whole tensor residing on a single GPU, each shard of the tensor resides on its designated GPU.
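A minimal sketch of the idea, assuming Python with PyTorch (not part of the original entry): a linear layer's weight is split column-wise, each shard computes its slice of the output independently, and concatenating the partial results reproduces the unsharded matmul. The sharding is only simulated on one device here; in a real setup each shard would live on its own GPU and the results would be combined with a collective such as all-gather.

```python
import torch

# Illustrative sketch of column-wise tensor parallelism for one linear layer.
# Names and shapes are made up for the example; this is not a library API.

num_shards = 2                  # pretend we have 2 GPUs
x = torch.randn(4, 8)           # input batch: (batch, in_features)
weight = torch.randn(8, 6)      # full weight: (in_features, out_features)

# Split the weight column-wise so each "GPU" holds only its own chunk.
weight_shards = torch.chunk(weight, num_shards, dim=1)

# Each shard computes its slice of the output independently.
partial_outputs = [x @ w_shard for w_shard in weight_shards]

# Concatenating the partial outputs (an all-gather in a real multi-GPU run)
# matches the result of the unsharded matmul.
y_parallel = torch.cat(partial_outputs, dim=1)
y_reference = x @ weight
assert torch.allclose(y_parallel, y_reference, atol=1e-6)
```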