jrc committed (verified) · Commit dc1a633 · Parent: 2e99b69

Update README.md

Files changed (1): README.md (+11 -18)
README.md CHANGED
@@ -15,11 +15,7 @@ pipeline_tag: text-generation
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-Phi-3 Mini 4k Instruct model finetuned on math datasets.
-
-## Uses
-
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+Math majors - who needs 'em? This model can answer any math questions you have.
 
 ## How to Get Started with the Model
 
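As a usage sketch to accompany the quick-start section: the `from_pretrained` call matches the diff context shown below, but the prompt, chat formatting, and generation settings are illustrative assumptions, not part of the original card.

```python
# Minimal sketch: load jrc/phi3-mini-math and ask it a math question.
# The chat template call and generation settings are assumptions based
# on standard transformers usage, not taken from the model card itself.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jrc/phi3-mini-math", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("jrc/phi3-mini-math", trust_remote_code=True)

messages = [{"role": "user", "content": "What is the derivative of x^2 * sin(x)?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```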
@@ -35,7 +31,7 @@ model = AutoModelForCausalLM.from_pretrained("jrc/phi3-mini-math", trust_remote_
 
 ## Training Details
 
-Phi3 was trained using [torchtune]() and the training script + config file are located in this repository.
+Phi-3 was trained using [torchtune](https://github.com/pytorch/torchtune), and the training script and config file are located in this repository.
 
 ```bash
 tune run lora_finetune_distributed.py --config mini_lora.yaml
@@ -45,11 +41,17 @@ tune run lora_finetune_distributed.py --config mini_lora.yaml
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-[More Information Needed]
-
-### Training Procedure
-
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+This model was finetuned on the following dataset:
+
+* TIGER-Lab/MATH-plus: An advanced math-specific dataset with 894k samples.
+
+
+#### Hardware
+
+4 x NVIDIA A100 GPUs
+
+Max VRAM used per GPU: 29 GB
+Wall-clock time: 12 hours
 
 ## Evaluation
 
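To sanity-check the dataset bullet added above, the data can be pulled with the `datasets` library; a minimal sketch, assuming the dataset is hosted on the Hub under that ID and exposes a `train` split:

```python
# Minimal sketch: inspect the finetuning data named in the hunk above.
# Assumes a Hub dataset "TIGER-Lab/MATH-plus" exposing a "train" split;
# adjust the split name if the actual layout differs.
from datasets import load_dataset

ds = load_dataset("TIGER-Lab/MATH-plus", split="train")
print(ds.num_rows)  # the card cites ~894k samples
print(ds[0])        # one example record
```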
@@ -75,15 +77,6 @@ tune run eleuther_eval --config eleuther_evaluation \
 | - minerva_math_precalc | 1|none | 4|exact_match|0.0623|± |0.0104|
 
 
-
-## Technical Specifications [optional]
-
-#### Hardware
-
-4 x NVIDIA A100 GPUs
-
-Max VRAM used per GPU: 29 GB
-
 ## Model Card Contact
 
 [More Information Needed]
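The minerva_math numbers above come from torchtune's eleuther_eval recipe; an approximately equivalent run through lm-evaluation-harness's Python API is sketched below, under the assumption that your installed lm-eval release provides `simple_evaluate` and the `minerva_math` task group:

```python
# Minimal sketch: reproduce one 4-shot minerva_math subtask score with
# lm-evaluation-harness directly, instead of torchtune's eleuther_eval
# recipe. Names assume a recent lm-eval release; verify against your version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jrc/phi3-mini-math,trust_remote_code=True",
    tasks=["minerva_math_precalc"],
    num_fewshot=4,  # matches the 4-shot setting in the reported table
)
print(results["results"]["minerva_math_precalc"])
```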
 