thenlper, tastelikefeet committed
Commit a8d08b3 · verified · 1 Parent(s): 132a598

Support fine-tuning (#52)

- Support fine-tuning (b552258c803a26b02e8bc80811458a80938f9ab0)


Co-authored-by: tastelikefeet <[email protected]>

Files changed (1)
  1. README.md +40 -0
README.md CHANGED
@@ -5685,6 +5685,46 @@ In addition to the open-source [GTE](https://huggingface.co/collections/Alibaba-

  Note that the models behind the commercial APIs are not entirely identical to the open-source models.

+ ## Community support
+
+ ### Fine-tuning
+
+ GTE models can be fine-tuned with a third party framework SWIFT.
+
+ ```shell
+ pip install ms-swift -U
+ ```
+
+ ```shell
+ # check: https://swift.readthedocs.io/en/latest/BestPractices/Embedding.html
+ nproc_per_node=8
+ NPROC_PER_NODE=$nproc_per_node \
+ USE_HF=1 \
+ swift sft \
+ --model Alibaba-NLP/gte-Qwen2-7B-instruct \
+ --train_type lora \
+ --dataset 'sentence-transformers/stsb' \
+ --torch_dtype bfloat16 \
+ --num_train_epochs 10 \
+ --per_device_train_batch_size 2 \
+ --per_device_eval_batch_size 1 \
+ --gradient_accumulation_steps $(expr 64 / $nproc_per_node) \
+ --eval_steps 100 \
+ --save_steps 100 \
+ --eval_strategy steps \
+ --use_chat_template false \
+ --save_total_limit 5 \
+ --logging_steps 5 \
+ --output_dir output \
+ --warmup_ratio 0.05 \
+ --learning_rate 5e-6 \
+ --deepspeed zero3 \
+ --dataloader_num_workers 4 \
+ --task_type embedding \
+ --loss_type cosine_similarity \
+ --dataloader_drop_last true
+ ```
+
  ## Citation

  If you find our paper or models helpful, please consider cite:
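
Note on the batch-size arithmetic in the added command: the `--gradient_accumulation_steps $(expr 64 / $nproc_per_node)` expression appears intended to keep the number of samples per optimizer step constant as the GPU count changes. The sketch below reproduces that arithmetic in isolation. It assumes plain data parallelism across `nproc_per_node` processes; the variable names `per_device_train_batch_size` and `effective_batch_size` are introduced here for illustration and are not part of the commit.

```shell
# Values taken from the command in the diff.
nproc_per_node=8
per_device_train_batch_size=2

# Same expression as in the diff: expr performs integer division.
gradient_accumulation_steps=$(expr 64 / $nproc_per_node)

# Assumption: each process contributes per_device_train_batch_size samples
# per micro-step, and gradients are accumulated before each optimizer step.
effective_batch_size=$((nproc_per_node * per_device_train_batch_size * gradient_accumulation_steps))

echo "gradient_accumulation_steps=$gradient_accumulation_steps"
echo "effective_batch_size=$effective_batch_size"
```

Because `expr` truncates, a GPU count that does not divide 64 evenly changes the effective batch size slightly (for example, 6 processes give 10 accumulation steps and 120 samples per step rather than 128).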