# Jamba

QLoRA with DeepSpeed needs at least 2 GPUs and 35 GiB of VRAM per GPU.

QLoRA on a single GPU: training will start, but the loss is off by an order of magnitude.
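
A minimal preflight sketch for the multi-GPU requirement above. The function name `meets_qlora_deepspeed_requirement` and the constants are hypothetical, not part of any library. The sketch assumes you pass in the VRAM (in GiB) of each GPU that will take part in training, and that every one of them must meet the 35 GiB figure.

```python
# Thresholds taken from the note above; the names are illustrative only.
MIN_GPUS = 2
MIN_VRAM_GIB = 35.0


def meets_qlora_deepspeed_requirement(gpu_vram_gib):
    """Return True if the listed GPUs satisfy the stated requirement.

    gpu_vram_gib: list of per-GPU VRAM sizes in GiB, one entry per GPU
    that will be used for training (assumption: all of them are used).
    """
    has_enough_gpus = len(gpu_vram_gib) >= MIN_GPUS
    each_gpu_is_large_enough = all(v >= MIN_VRAM_GIB for v in gpu_vram_gib)
    return has_enough_gpus and each_gpu_is_large_enough


if __name__ == "__main__":
    # Example values, not measurements.
    print(meets_qlora_deepspeed_requirement([40.0, 40.0]))
    print(meets_qlora_deepspeed_requirement([80.0]))
```

If PyTorch is installed, the per-GPU values could be read with `torch.cuda.get_device_properties(i).total_memory / 2**30` for each device index `i`. Note that this reports total memory, not what is free at launch.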