yniu committed
Commit 45f361d · verified · 1 Parent(s): e9c474b

Update README.md

Files changed (1)
  1. README.md +42 -40
README.md CHANGED
@@ -1,40 +1,42 @@
- ---
- license: mit
- ---
-
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # StableToolBench-MirrorAPI
-
- This model is a fine-tuned version of Qwen2.5-7B-Instruct
-
- ## Training and evaluation data
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 2
- - eval_batch_size: 2
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 128
- - total_eval_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.04
- - lr_scheduler_warmup_steps: 100
- - num_epochs: 5.0
-
- ### Framework versions
-
- - Transformers 4.44.2
- - Pytorch 2.4.1+cu118
- - Datasets 2.21.0
- - Tokenizers 0.19.1
+ ---
+ license: mit
+ ---
+
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # StableToolBench-MirrorAPI
+
+ This model is a fine-tuned version of [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct).
+
+ ### Training data
+ The [training data of MirrorAPI](https://huggingface.co/datasets/stabletoolbench/MirrorAPI/tree/main/train) consists of:
+ - train_sft.json
+ - train_cot.json
+ - train_augment.json
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 128
+ - total_eval_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.04
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 5.0
+
+ ### Framework versions
+
+ - Transformers 4.44.2
+ - Pytorch 2.4.1+cu118
+ - Datasets 2.21.0
+ - Tokenizers 0.19.1
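
The updated card names the base model but includes no usage snippet. Below is a minimal inference sketch; the repo id `stabletoolbench/MirrorAPI` is an assumption inferred from the dataset link, and `transformers` plus `accelerate` (for `device_map="auto"`) are assumed to be installed.

```python
# Minimal inference sketch (not an official example).
# Assumption: the model is hosted under this repo id; substitute the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabletoolbench/MirrorAPI"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Qwen2.5-Instruct checkpoints ship a chat template; apply it before generating.
messages = [{"role": "user", "content": "Simulate a response for this API call."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```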
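The three files listed under "Training data" sit in the `train/` folder of the dataset repo. Here is a sketch of fetching them with `huggingface_hub`, assuming each file parses as a JSON array of examples:

```python
# Download the three training files named in the card from the dataset repo.
# Assumption: each file is a JSON array; adjust parsing if the layout differs.
import json
from huggingface_hub import hf_hub_download

examples = []
for name in ["train_sft.json", "train_cot.json", "train_augment.json"]:
    path = hf_hub_download(
        repo_id="stabletoolbench/MirrorAPI",
        repo_type="dataset",
        filename=f"train/{name}",
    )
    with open(path, encoding="utf-8") as f:
        examples.extend(json.load(f))
print(f"loaded {len(examples)} training examples")
```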
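The hyperparameter list maps onto `transformers.TrainingArguments` roughly as sketched below; this illustrates the listed values rather than reproducing the authors' training script. Note that 2 examples per device across 8 GPUs with 8 accumulation steps gives the stated total train batch size of 128.

```python
# Illustrative mapping of the card's hyperparameters onto TrainingArguments.
# output_dir is a placeholder; precision settings are not stated in the card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mirrorapi-checkpoints",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=2,       # train_batch_size: 2
    per_device_eval_batch_size=2,        # eval_batch_size: 2
    seed=42,
    gradient_accumulation_steps=8,       # 2 per device * 8 GPUs * 8 steps = 128 total
    lr_scheduler_type="cosine",
    warmup_ratio=0.04,
    warmup_steps=100,                    # transformers prefers steps over ratio when > 0
    num_train_epochs=5.0,
    adam_beta1=0.9,                      # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```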