To reproduce this run:

accelerate launch --multi_gpu --mixed_precision=fp16 --num_processes=8 run_distillation.py config_mistral.yaml
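
The command above assumes an 8-GPU machine. As a minimal sketch for adapting it to other hardware, the helper below rebuilds the same argument list for a different process count. The function name `build_launch_command` and its defaults are hypothetical, not part of the distillation code, and the sketch only prints the command rather than running it. It assumes `--multi_gpu` should be dropped when a single process is used.

```python
import shlex


def build_launch_command(
    num_processes=8,
    mixed_precision="fp16",
    script="run_distillation.py",
    config="config_mistral.yaml",
):
    """Return the `accelerate launch` argument list for a given process count.

    Hypothetical helper: it mirrors the command in this card and changes
    only the flags that depend on the number of processes.
    """
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")

    cmd = ["accelerate", "launch"]
    # Assumption: --multi_gpu is only wanted when more than one process is used.
    if num_processes > 1:
        cmd.append("--multi_gpu")
    cmd += [
        f"--mixed_precision={mixed_precision}",
        f"--num_processes={num_processes}",
        script,
        config,
    ]
    return cmd


if __name__ == "__main__":
    # Print the command for inspection; pass the list to subprocess.run to launch it.
    print(shlex.join(build_launch_command(num_processes=8)))
```

With the defaults, the printed command matches the one in this card. Effective batch size scales with the number of processes, so batch-size or gradient-accumulation settings in the config may need adjusting to match the original run.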

Model size: 1.57B params (Safetensors, tensor type F32)


Dataset used to train sanchit-gandhi/distil-mistral-1.5B-v0.1