PursuitOfDataScience committed on
Commit 67f9c92 · verified · 1 Parent(s): eb949fa

Update README.md

Files changed (1)
  1. README.md +69 -1
README.md CHANGED
@@ -91,4 +91,72 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
  <pre>
  The meaning of life is tamed in many ways. It is a state of mental and physical development. It is a state of deep emotional strength and confidence, and it is a state of physical and mental balance. In this article, we will explore the meaning of life, the different ways life is defined, and how we can apply this concept to our own lives.
- </pre>
+ </pre>
+
+ # MMLU Evaluation Results
+
+ ## **Overall Accuracy**
+ **0.2549 (3579/14042)**
+
+ ## Breakdown by Subject:
+ | Subject | Accuracy (Correct/Total) |
+ |------------------------------|----------------------|
+ | international_law | 0.3636 (44/121) |
+ | anatomy | 0.3333 (45/135) |
+ | abstract_algebra | 0.3200 (32/100) |
+ | global_facts | 0.3100 (31/100) |
+ | high_school_computer_science | 0.3100 (31/100) |
+ | college_biology | 0.3056 (44/144) |
+ | philosophy | 0.3055 (95/311) |
+ | high_school_chemistry | 0.3054 (62/203) |
+ | world_religions | 0.3041 (52/171) |
+ | prehistory | 0.3025 (98/324) |
+ | high_school_european_history | 0.2970 (49/165) |
+ | elementary_mathematics | 0.2963 (112/378) |
+ | logical_fallacies | 0.2945 (48/163) |
+ | computer_security | 0.2900 (29/100) |
+ | professional_psychology | 0.2827 (173/612) |
+ | moral_disputes | 0.2803 (97/346) |
+ | professional_accounting | 0.2766 (78/282) |
+ | electrical_engineering | 0.2759 (40/145) |
+ | college_computer_science | 0.2700 (27/100) |
+ | college_mathematics | 0.2700 (27/100) |
+ | professional_law | 0.2692 (413/1534) |
+ | high_school_world_history | 0.2658 (63/237) |
+ | high_school_mathematics | 0.2630 (71/270) |
+ | high_school_geography | 0.2626 (52/198) |
+ | high_school_biology | 0.2613 (81/310) |
+ | astronomy | 0.2566 (39/152) |
+ | high_school_us_history | 0.2549 (52/204) |
+ | econometrics | 0.2544 (29/114) |
+ | college_medicine | 0.2543 (44/173) |
+ | us_foreign_policy | 0.2500 (25/100) |
+ | moral_scenarios | 0.2469 (221/895) |
+ | security_studies | 0.2449 (60/245) |
+ | miscellaneous | 0.2439 (191/783) |
+ | marketing | 0.2436 (57/234) |
+ | sociology | 0.2388 (48/201) |
+ | high_school_physics | 0.2384 (36/151) |
+ | human_aging | 0.2377 (53/223) |
+ | nutrition | 0.2353 (72/306) |
+ | jurisprudence | 0.2315 (25/108) |
+ | business_ethics | 0.2300 (23/100) |
+ | high_school_government_and_politics | 0.2280 (44/193) |
+ | college_physics | 0.2255 (23/102) |
+ | clinical_knowledge | 0.2226 (59/265) |
+ | high_school_statistics | 0.2222 (48/216) |
+ | high_school_psychology | 0.2220 (121/545) |
+ | conceptual_physics | 0.2213 (52/235) |
+ | public_relations | 0.2182 (24/110) |
+ | human_sexuality | 0.2137 (28/131) |
+ | college_chemistry | 0.2100 (21/100) |
+ | medical_genetics | 0.2100 (21/100) |
+ | high_school_microeconomics | 0.2059 (49/238) |
+ | machine_learning | 0.2054 (23/112) |
+ | high_school_macroeconomics | 0.2051 (80/390) |
+ | virology | 0.2048 (34/166) |
+ | management | 0.1845 (19/103) |
+ | professional_medicine | 0.1654 (45/272) |
+ | formal_logic | 0.1508 (19/126) |
+
+ ---
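
The headline figure in the added README section looks like a micro-average, that is, total correct answers divided by total questions across all subjects. The commit does not show the evaluation script, so the following is only a minimal sketch of how a reader might re-derive that number from the table rows and check each row's rounded accuracy against its counts. The names `parse_rows`, `micro_accuracy`, and `inconsistent_rows` are hypothetical helpers introduced here for illustration.

```python
import re

# Matches rows such as "| anatomy | 0.3333 (45/135) |", with or without
# a leading "+" diff marker. The header and separator rows do not match.
ROW = re.compile(r"\|\s*(\w+)\s*\|\s*([\d.]+)\s*\((\d+)/(\d+)\)\s*\|")


def parse_rows(text):
    """Return {subject: (reported_accuracy, correct, total)} per table row."""
    rows = {}
    for line in text.splitlines():
        m = ROW.search(line)
        if m:
            subject, acc, correct, total = m.groups()
            rows[subject] = (float(acc), int(correct), int(total))
    return rows


def micro_accuracy(rows):
    """Total correct divided by total questions, pooled over all subjects."""
    correct = sum(c for _, c, _ in rows.values())
    total = sum(t for _, _, t in rows.values())
    return correct / total


def inconsistent_rows(rows, tol=0.5e-4 + 1e-9):
    """Subjects whose 4-decimal accuracy disagrees with correct/total."""
    return [s for s, (acc, c, t) in rows.items() if abs(acc - c / t) > tol]


if __name__ == "__main__":
    # Three rows copied from the table above; paste the full table to
    # recompute the overall figure.
    sample = """
    + | international_law | 0.3636 (44/121) |
    + | anatomy | 0.3333 (45/135) |
    + | formal_logic | 0.1508 (19/126) |
    """
    rows = parse_rows(sample)
    print(f"micro accuracy over {len(rows)} subjects: {micro_accuracy(rows):.4f}")
    print("rows with mismatched accuracy:", inconsistent_rows(rows))
```

Run over the full table, the pooled result can be compared with the reported 0.2549 (3579/14042). Assuming the usual four-option MMLU format, random guessing would score about 0.25, which is a useful baseline when reading the per-subject numbers.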