• Starting from opencsg/csg-wukong-2b-chinese-fineweb-edu as the base model, we fine-tune it on smoltalk-chinese for 2 epochs.
  • learning rate = 3e-4; global batch size = 32; LR scheduler = cosine
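
The hyperparameters above can be sketched in plain Python to show how they interact. This is an illustration only, not the authors' training script. The learning rate, global batch size, and epoch count come from the card. The absence of warmup, the minimum learning rate of 0, the per-device batch size, the device count, and the example count are all assumptions made for the sketch, and the helper names are hypothetical.

```python
import math

# Values stated in the model card.
PEAK_LR = 3e-4
GLOBAL_BATCH_SIZE = 32
NUM_EPOCHS = 2


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps.

    Assumption: no warmup phase and min_lr = 0; the card specifies neither.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def grad_accum_steps(global_batch: int, per_device_batch: int, num_devices: int) -> int:
    """Gradient-accumulation steps needed to reach a given global batch size."""
    per_step = per_device_batch * num_devices
    if global_batch % per_step != 0:
        raise ValueError("global batch must be divisible by per_device_batch * num_devices")
    return global_batch // per_step


def total_optimizer_steps(num_examples: int,
                          global_batch: int = GLOBAL_BATCH_SIZE,
                          epochs: int = NUM_EPOCHS) -> int:
    """Optimizer steps over all epochs, rounding the last partial batch up."""
    return math.ceil(num_examples / global_batch) * epochs


if __name__ == "__main__":
    # Hypothetical setup: 2 devices with a per-device batch of 4.
    print("gradient accumulation steps:", grad_accum_steps(GLOBAL_BATCH_SIZE, 4, 2))
    # Hypothetical dataset size, chosen only to make the arithmetic easy to follow.
    steps = total_optimizer_steps(6400)
    for s in (0, steps // 2, steps):
        print(f"step {s}: lr = {cosine_lr(s, steps):.2e}")
```

With a cosine schedule the learning rate is at its peak on the first step, at half the peak midway through training, and at the minimum on the final step, so the total step count (and therefore the dataset size and global batch size) determines the shape of the decay.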
• Model size: 2.17B params; tensor type: BF16; weights format: Safetensors