Commit a64eeb1 · 1 Parent(s): 7fde4e7
Update README.md
README.md CHANGED
@@ -53,14 +53,17 @@ We evaluated model_001 on a wide range of tasks using [Language Model Evaluation
 
 Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
-
-
-|**Task**|**
-|*
-|*
-|*
-|*
-|
+|||
+|:------:|:-------:|
+|**Task**|**Value**|
+|*ARC*|0.6869|
+|*HellaSwag*|0.8642|
+|*MMLU*|0.6992|
+|*TruthfulQA*|0.5885|
+|*Winogrande*|0.8208|
+|*GSM8k*|0.4481|
+|*DROP*|0.5510|
+|**Total Average**|**0.6655**|
 
 
 <br>
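
As a quick sanity check on the table added in this commit, the sketch below recomputes the "Total Average" row from the seven task scores. It assumes the average is the plain unweighted mean of the listed values; the README does not state how it was computed, and the variable names here are illustrative only.

```python
# Scores copied from the table added in this commit.
scores = {
    "ARC": 0.6869,
    "HellaSwag": 0.8642,
    "MMLU": 0.6992,
    "TruthfulQA": 0.5885,
    "Winogrande": 0.8208,
    "GSM8k": 0.4481,
    "DROP": 0.5510,
}

# Assumption: "Total Average" is the unweighted mean of the seven task scores.
total_average = sum(scores.values()) / len(scores)

print(f"{total_average:.4f}")  # prints 0.6655
```

Under that assumption the result, rounded to four decimals, matches the 0.6655 reported in the table.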