yniu committed
Commit 45f361d · verified · 1 Parent(s): e9c474b

Update README.md

Files changed (1)
  1. README.md +42 -40
README.md CHANGED
@@ -1,40 +1,42 @@
- ---
- license: mit
- ---
-
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # StableToolBench-MirrorAPI
-
- This model is a fine-tuned version of Qwen2.5-7B-Instruct
-
- ## Training and evaluation data
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 2
- - eval_batch_size: 2
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 128
- - total_eval_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.04
- - lr_scheduler_warmup_steps: 100
- - num_epochs: 5.0
-
- ### Framework versions
-
- - Transformers 4.44.2
- - Pytorch 2.4.1+cu118
- - Datasets 2.21.0
- - Tokenizers 0.19.1
+ ---
+ license: mit
+ ---
+
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # StableToolBench-MirrorAPI
+
+ This model is a fine-tuned version of [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct).
+
+ ### Training data
+ The [training data of MirrorAPI](https://huggingface.co/datasets/stabletoolbench/MirrorAPI/tree/main/train) consists of:
+ - train_sft.json
+ - train_cot.json
+ - train_augment.json
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 128
+ - total_eval_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.04
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 5.0
+
+ ### Framework versions
+
+ - Transformers 4.44.2
+ - Pytorch 2.4.1+cu118
+ - Datasets 2.21.0
+ - Tokenizers 0.19.1
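
The updated card names the base model but includes no usage snippet. Below is a minimal inference sketch; the repo id `stabletoolbench/MirrorAPI` is an assumption inferred from the dataset link, and `transformers` plus `accelerate` (for `device_map="auto"`) are assumed to be installed.

```python
# Minimal inference sketch (not an official example).
# Assumption: the model is hosted under this repo id; substitute the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabletoolbench/MirrorAPI"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Qwen2.5-Instruct checkpoints ship a chat template; apply it before generating.
messages = [{"role": "user", "content": "Simulate a response for this API call."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```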
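The three files listed under "Training data" sit in the `train/` folder of the dataset repo. Here is a sketch of fetching them with `huggingface_hub`, assuming each file parses as a JSON array of examples:

```python
# Download the three training files named in the card from the dataset repo.
# Assumption: each file is a JSON array; adjust parsing if the layout differs.
import json
from huggingface_hub import hf_hub_download

examples = []
for name in ["train_sft.json", "train_cot.json", "train_augment.json"]:
    path = hf_hub_download(
        repo_id="stabletoolbench/MirrorAPI",
        repo_type="dataset",
        filename=f"train/{name}",
    )
    with open(path, encoding="utf-8") as f:
        examples.extend(json.load(f))
print(f"loaded {len(examples)} training examples")
```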
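The hyperparameter list maps onto `transformers.TrainingArguments` roughly as sketched below; this illustrates the listed values rather than reproducing the authors' training script. Note that 2 examples per device across 8 GPUs with 8 accumulation steps gives the stated total train batch size of 128.

```python
# Illustrative mapping of the card's hyperparameters onto TrainingArguments.
# output_dir is a placeholder; precision settings are not stated in the card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mirrorapi-checkpoints",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=2,       # train_batch_size: 2
    per_device_eval_batch_size=2,        # eval_batch_size: 2
    seed=42,
    gradient_accumulation_steps=8,       # 2 per device * 8 GPUs * 8 steps = 128 total
    lr_scheduler_type="cosine",
    warmup_ratio=0.04,
    warmup_steps=100,                    # transformers prefers steps over ratio when > 0
    num_train_epochs=5.0,
    adam_beta1=0.9,                      # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```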