Support fine-tuning (#52)
- Support fine-tuning (b552258c803a26b02e8bc80811458a80938f9ab0)
Co-authored-by: tastelikefeet <[email protected]>
README.md
CHANGED
@@ -5685,6 +5685,46 @@ In addition to the open-source [GTE](https://huggingface.co/collections/Alibaba-

 Note that the models behind the commercial APIs are not entirely identical to the open-source models.

+## Community support
+
+### Fine-tuning
+
+GTE models can be fine-tuned with SWIFT, a third-party framework.
+
+```shell
+pip install ms-swift -U
+```
+
+```shell
+# check: https://swift.readthedocs.io/en/latest/BestPractices/Embedding.html
+nproc_per_node=8
+NPROC_PER_NODE=$nproc_per_node \
+USE_HF=1 \
+swift sft \
+    --model Alibaba-NLP/gte-Qwen2-7B-instruct \
+    --train_type lora \
+    --dataset 'sentence-transformers/stsb' \
+    --torch_dtype bfloat16 \
+    --num_train_epochs 10 \
+    --per_device_train_batch_size 2 \
+    --per_device_eval_batch_size 1 \
+    --gradient_accumulation_steps $(expr 64 / $nproc_per_node) \
+    --eval_steps 100 \
+    --save_steps 100 \
+    --eval_strategy steps \
+    --use_chat_template false \
+    --save_total_limit 5 \
+    --logging_steps 5 \
+    --output_dir output \
+    --warmup_ratio 0.05 \
+    --learning_rate 5e-6 \
+    --deepspeed zero3 \
+    --dataloader_num_workers 4 \
+    --task_type embedding \
+    --loss_type cosine_similarity \
+    --dataloader_drop_last true
+```
+
 ## Citation

 If you find our paper or models helpful, please consider citing:
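
The `--gradient_accumulation_steps $(expr 64 / $nproc_per_node)` expression in the added command appears intended to keep the global batch size constant when the GPU count changes. A minimal sketch of that arithmetic on a single node follows. The variable names `per_device_batch`, `accum_constant`, `grad_accum`, and `global_batch` are illustrative and are not SWIFT options, and the sketch assumes the GPU count divides 64 evenly, since `expr` truncates integer division.

```shell
#!/bin/sh
# Illustrative arithmetic only: these variable names are hypothetical, not SWIFT options.
nproc_per_node=8      # GPUs on the node, as in the command above
per_device_batch=2    # mirrors --per_device_train_batch_size
accum_constant=64     # the constant the command divides by the GPU count

# Same expression the command passes to --gradient_accumulation_steps.
grad_accum=$(expr "$accum_constant" / "$nproc_per_node")

# Samples contributing to one optimizer step across all GPUs on the node.
global_batch=$(expr "$per_device_batch" \* "$nproc_per_node" \* "$grad_accum")

echo "gradient_accumulation_steps=$grad_accum"
echo "global_batch_size=$global_batch"
```

With 8 GPUs this gives 8 accumulation steps and a global batch of 128. Setting `nproc_per_node=4` doubles the accumulation steps to 16 and leaves the global batch at 128.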