Update README.md
README.md
CHANGED
@@ -4,7 +4,7 @@ datasets:
|
|
4 |
- nvidia/OpenMathInstruct-1
|
5 |
language:
|
6 |
- en
|
7 |
-
library_name:
|
8 |
tags:
|
9 |
- nvidia
|
10 |
- code
|
@@ -14,31 +14,81 @@ tags:
|
|
14 |
|
15 |
# OpenMath-CodeLlama-7b-Python
|
16 |
|
17 |
-
## Description:
|
18 |
-
|
19 |
OpenMath models were designed to solve mathematical problems by integrating text-based reasoning with code blocks
|
20 |
executed by a Python interpreter. The models were trained on [OpenMathInstruct-1](https://huggingface.co/datasets/nvidia/OpenMathInstruct-1),
|
21 |
a math instruction tuning dataset with 1.8M problem-solution pairs generated using the permissively licensed
|
22 |
[Mixtral-8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) model.
|
23 |
|
24 |
-
|
25 |
-
|
26 |
-
|
27 |
-
|
28 |
-
|
29 |
-
|
30 |
-
|
31 |
-
|
32 |
-
|
33 |
-
|
34 |
-
|
35 |
-
|
36 |
-
|
37 |
-
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
|
43 |
The pipeline we used to produce these models is fully open-sourced under a commercially permissive license.
|
44 |
|
@@ -69,7 +119,7 @@ Please see [NeMo-Skills Github repo](https://github.com/Kipok/NeMo-Skills) for t
|
|
69 |
|
70 |
## Contact
|
71 |
|
72 |
-
E-Mail
|
73 |
|
74 |
## Citation
|
75 |
|
@@ -78,4 +128,5 @@ If you find this model useful, please cite the following works
|
|
78 |
TODO
|
79 |
|
80 |
## License
|
|
|
81 |
The use of this model is governed by the [Llama 2 Community License Agreement](https://ai.meta.com/llama/license/)
|
|
|
4 |
- nvidia/OpenMathInstruct-1
|
5 |
language:
|
6 |
- en
|
7 |
+
library_name: nemo
|
8 |
tags:
|
9 |
- nvidia
|
10 |
- code
|
|
|
14 |
|
15 |
# OpenMath-CodeLlama-7b-Python
|
16 |
|
|
|
|
|
17 |
OpenMath models were designed to solve mathematical problems by integrating text-based reasoning with code blocks
|
18 |
executed by a Python interpreter. The models were trained on [OpenMathInstruct-1](https://huggingface.co/datasets/nvidia/OpenMathInstruct-1),
|
19 |
a math instruction tuning dataset with 1.8M problem-solution pairs generated using the permissively licensed
|
20 |
[Mixtral-8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) model.
|
21 |
|
22 |
+
<table border="1">
|
23 |
+
<tr>
|
24 |
+
<td></td>
|
25 |
+
<td colspan="2" style="text-align: center;">greedy</td>
|
26 |
+
<td colspan="2" style="text-align: center;">majority@50</td>
|
27 |
+
</tr>
|
28 |
+
<tr>
|
29 |
+
<td style="text-align: center;">model</td>
|
30 |
+
<td style="text-align: center;">GSM8K</td>
|
31 |
+
<td style="text-align: center;">MATH</td>
|
32 |
+
<td style="text-align: center;">GSM8K</td>
|
33 |
+
<td style="text-align: center;">MATH</td>
|
34 |
+
</tr>
|
35 |
+
<tr>
|
36 |
+
<td style="text-align: right;">GPT-4 <a href="https://arxiv.org/abs/2312.08935">[1]</a></td>
|
37 |
+
<td style="text-align: center;">94.4</td>
|
38 |
+
<td style="text-align: center;">56.2</td>
|
39 |
+
<td style="text-align: center;">-</td>
|
40 |
+
<td style="text-align: center;">-</td>
|
41 |
+
</tr>
|
42 |
+
<tr>
|
43 |
+
<td style="text-align: right;">GPT-4 + code <a href="https://arxiv.org/abs/2308.07921v1">[2]</a></td>
|
44 |
+
<td style="text-align: center;">92.9</td>
|
45 |
+
<td style="text-align: center;">69.7</td>
|
46 |
+
<td style="text-align: center;">-</td>
|
47 |
+
<td style="text-align: center;">-</td>
|
48 |
+
</tr>
|
49 |
+
<tr>
|
50 |
+
<td style="text-align: right;">OpenMath-CodeLlama-7B (<a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-7b-Python">nemo</a> | <a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-7b-Python-hf">HF</a>)</td>
|
51 |
+
<td style="text-align: center;">75.9</td>
|
52 |
+
<td style="text-align: center;">43.6</td>
|
53 |
+
<td style="text-align: center;">84.8</td>
|
54 |
+
<td style="text-align: center;">55.6</td>
|
55 |
+
</tr>
|
56 |
+
<tr>
|
57 |
+
<td style="text-align: right;">OpenMath-Mistral-7B (<a href="https://huggingface.co/nvidia/OpenMath-Mistral-7B-v0.1">nemo</a> | <a href="https://huggingface.co/nvidia/OpenMath-Mistral-7B-v0.1-hf">HF</a>)</td>
|
58 |
+
<td style="text-align: center;">80.2</td>
|
59 |
+
<td style="text-align: center;">44.5</td>
|
60 |
+
<td style="text-align: center;">86.9</td>
|
61 |
+
<td style="text-align: center;">57.2</td>
|
62 |
+
</tr>
|
63 |
+
<tr>
|
64 |
+
<td style="text-align: right;">OpenMath-CodeLlama-13B (<a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-13b-Python">nemo</a> | <a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-13b-Python-hf">HF</a>)</td>
|
65 |
+
<td style="text-align: center;">78.8</td>
|
66 |
+
<td style="text-align: center;">45.5</td>
|
67 |
+
<td style="text-align: center;">86.8</td>
|
68 |
+
<td style="text-align: center;">57.6</td>
|
69 |
+
</tr>
|
70 |
+
<tr>
|
71 |
+
<td style="text-align: right;">OpenMath-CodeLlama-34B (<a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-34b-Python">nemo</a> | <a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-34b-Python-hf">HF</a>)</td>
|
72 |
+
<td style="text-align: center;">80.7</td>
|
73 |
+
<td style="text-align: center;">48.3</td>
|
74 |
+
<td style="text-align: center;">88.0</td>
|
75 |
+
<td style="text-align: center;">60.2</td>
|
76 |
+
</tr>
|
77 |
+
<tr>
|
78 |
+
<td style="text-align: right;">OpenMath-Llama2-70B (<a href="https://huggingface.co/nvidia/OpenMath-Llama-2-70b">nemo</a> | <a href="https://huggingface.co/nvidia/OpenMath-Llama-2-70b-hf">HF</a>)</td>
|
79 |
+
<td style="text-align: center;"><b>84.7</b></td>
|
80 |
+
<td style="text-align: center;">46.3</td>
|
81 |
+
<td style="text-align: center;">90.1</td>
|
82 |
+
<td style="text-align: center;">58.3</td>
|
83 |
+
</tr>
|
84 |
+
<tr>
|
85 |
+
<td style="text-align: right;">OpenMath-CodeLlama-70B (<a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-70b-Python">nemo</a> | <a href="https://huggingface.co/nvidia/OpenMath-CodeLlama-70b-Python-hf">HF</a>)</td>
|
86 |
+
<td style="text-align: center;">84.6</td>
|
87 |
+
<td style="text-align: center;"><b>50.7</b></td>
|
88 |
+
<td style="text-align: center;"><b>90.8</b></td>
|
89 |
+
<td style="text-align: center;"><b>60.4</b></td>
|
90 |
+
</tr>
|
91 |
+
</table>
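The table reports two decoding settings, greedy and majority@50, without defining the second. A minimal sketch of one common reading follows, assuming majority@50 means sampling 50 solutions per problem and keeping the most frequent final answer. The function name `majority_answer` and the sample data are hypothetical and are not taken from the NeMo-Skills pipeline.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_answer(answers: Sequence[Optional[str]]) -> Optional[str]:
    """Return the most frequent non-empty answer, or None if there is none."""
    # Drop samples that produced no final answer (assumption: such samples
    # are represented as None or as an empty string).
    valid = [a.strip() for a in answers if a is not None and a.strip()]
    if not valid:
        return None
    # Counter keeps equal counts in first-seen order, so a tie goes to the
    # answer that was sampled first.
    return Counter(valid).most_common(1)[0][0]


# Hypothetical final answers extracted from five sampled solutions.
samples = ["42", "41", "42", None, "42 "]
print(majority_answer(samples))
```

With 50 samples per problem the same function applies unchanged; only the length of `answers` grows. How ties and unparseable answers are actually handled in the reported numbers is not stated in this README.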
|
92 |
|
93 |
The pipeline we used to produce these models is fully open-sourced under a commercially permissive license.
|
94 |
|
|
|
119 |
|
120 |
## Contact
|
121 |
|
122 |
+
E-Mail Igor Gitman at [email protected]
|
123 |
|
124 |
## Citation
|
125 |
|
|
|
128 |
TODO
|
129 |
|
130 |
## License
|
131 |
+
|
132 |
The use of this model is governed by the [Llama 2 Community License Agreement](https://ai.meta.com/llama/license/)
|