jrc committed · Commit 72799dd · verified · 1 Parent(s): 69ed273

Update README.md

Files changed (1): README.md (+84 -1)
README.md CHANGED
@@ -9,4 +9,87 @@ tags:
 - minerva-math
 library_name: transformers
 pipeline_tag: text-generation
- ---
+ ---
+
+ # jrc/phi3-mini-math
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ Phi-3 Mini 4k Instruct model finetuned on math datasets.
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ ## Evaluation
+
+ The finetuned model is evaluated on the minerva-math task using the EleutherAI Eval Harness through torchtune:
+
+ CMD: `tune run eleuther_eval --config eleuther_evaluation checkpoint.checkpoint_dir=./lora-phi3-math tasks=["minerva_math"] batch_size=32`
+
+ RESULTS:
+ ```
+ | Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
+ |------------------------------------|-------|------|-----:|-----------|-----:|---|-----:|
+ |minerva_math |N/A |none | 4|exact_match|0.1670|± |0.0051|
+ | - minerva_math_algebra | 1|none | 4|exact_match|0.2502|± |0.0126|
+ | - minerva_math_counting_and_prob | 1|none | 4|exact_match|0.1329|± |0.0156|
+ | - minerva_math_geometry | 1|none | 4|exact_match|0.1232|± |0.0150|
+ | - minerva_math_intermediate_algebra| 1|none | 4|exact_match|0.0576|± |0.0078|
+ | - minerva_math_num_theory | 1|none | 4|exact_match|0.1148|± |0.0137|
+ | - minerva_math_prealgebra | 1|none | 4|exact_match|0.3077|± |0.0156|
+ | - minerva_math_precalc | 1|none | 4|exact_match|0.0623|± |0.0104|
+ ```
+
+
+ ## Technical Specifications [optional]
+
+ ### Hardware
+
+ 4 x NVIDIA A100 GPUs
+
+ Max VRAM used per GPU: 29 GB
70
+
71
+ ## Citation [optional]
72
+
73
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
74
+
75
+ **BibTeX:**
76
+
77
+ [More Information Needed]
78
+
79
+ **APA:**
80
+
81
+ [More Information Needed]
82
+
83
+ ## Glossary [optional]
84
+
85
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
86
+
87
+ [More Information Needed]
88
+
89
+ ## More Information [optional]
90
+
91
+ [More Information Needed]
92
+
93
+ ## Model Card Contact
94
+
95
+ [More Information Needed]
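
The card's "How to Get Started" section is still a placeholder. A minimal sketch of what it could contain, assuming the standard Hugging Face `transformers` text-generation API (per the card's `library_name` and `pipeline_tag`) and the Phi-3 instruct chat layout (`<|user|>` / `<|end|>` / `<|assistant|>` tokens); the model id `jrc/phi3-mini-math` is taken from the card title, and the example question is illustrative:

```python
def build_phi3_prompt(question: str) -> str:
    """Wrap a question in the Phi-3 instruct chat template (assumed token layout)."""
    return f"<|user|>\n{question}<|end|>\n<|assistant|>\n"


def generate(question: str, max_new_tokens: int = 256) -> str:
    """Generate an answer with the fine-tuned checkpoint (downloads the model)."""
    # Imported lazily so the prompt helper above has no heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "jrc/phi3-mini-math"  # from the card title
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(build_phi3_prompt(question), return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the model's answer is returned.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)


# Example (requires downloading the checkpoint):
# print(generate("What is the derivative of x^2?"))
```

Alternatively, `tokenizer.apply_chat_template` can build the prompt from the tokenizer's own template rather than the hand-written helper above.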
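
As a sanity check on the evaluation table in the diff above: the aggregate `minerva_math` exact_match is the size-weighted mean of the per-subtask scores, not a plain average. The per-subtask example counts below are the standard MATH test-split sizes, which are an assumption (they are not stated in the card):

```python
# Per-subtask exact_match from the results table, paired with assumed
# MATH test-split sizes (not stated in the card).
subtasks = {
    "algebra":              (0.2502, 1187),
    "counting_and_prob":    (0.1329, 474),
    "geometry":             (0.1232, 479),
    "intermediate_algebra": (0.0576, 903),
    "num_theory":           (0.1148, 540),
    "prealgebra":           (0.3077, 871),
    "precalc":              (0.0623, 546),
}

total = sum(n for _, n in subtasks.values())
weighted = sum(score * n for score, n in subtasks.values()) / total
print(f"{weighted:.4f}")  # 0.1670, matching the aggregate row
```

The unweighted mean of the seven scores is only about 0.150, so the weighting by subtask size is what reproduces the reported 0.1670.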