Commit a64eeb1 · 1 Parent(s): 7fde4e7
Update README.md
README.md CHANGED
@@ -53,14 +53,17 @@ We evaluated model_001 on a wide range of tasks using [Language Model Evaluation
 
 Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
-
-
-|**Task**|**
-|*
-|*
-|*
-|*
-|
+|||
+|:------:|:-------:|
+|**Task**|**Value**|
+|*ARC*|0.6869|
+|*HellaSwag*|0.8642|
+|*MMLU*|0.6992|
+|*TruthfulQA*|0.5885|
+|*Winogrande*|0.8208|
+|*GSM8k*|0.4481|
+|*DROP*|0.5510|
+|**Total Average**|**0.6655**|
 
 
 <br>
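
The **Total Average** row added in this commit can be sanity-checked against the seven per-task values. The sketch below assumes the total is the unweighted mean of those seven scores; the diff does not state this, and the variable names are illustrative only.

```python
# Per-task scores copied from the table added in this commit.
scores = {
    "ARC": 0.6869,
    "HellaSwag": 0.8642,
    "MMLU": 0.6992,
    "TruthfulQA": 0.5885,
    "Winogrande": 0.8208,
    "GSM8k": 0.4481,
    "DROP": 0.5510,
}

# Assumption: "Total Average" is the plain (unweighted) mean of the seven tasks.
average = sum(scores.values()) / len(scores)

print(f"{average:.4f}")  # prints 0.6655
```

Under that assumption, the computed mean matches the reported 0.6655 to four decimal places.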