pankajmathur committed on
Commit a64eeb1 · 1 Parent(s): 7fde4e7

Update README.md

Files changed (1)
  1. README.md +11 -8
README.md CHANGED
@@ -53,14 +53,17 @@ We evaluated model_001 on a wide range of tasks using [Language Model Evaluation
 
 Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
-|||||
-|:------:|:--------:|:-------:|:--------:|
-|**Task**|**Metric**|**Value**|**Stderr**|
-|*arc_challenge*|acc_norm|0.7108|0.0141|
-|*hellaswag*|acc_norm|0.8765|0.0038|
-|*mmlu*|acc_norm|0.6904|0.0351|
-|*truthfulqa_mc*|mc2|0.6312|0.0157|
-|**Total Average**|-|**0.72729**||
+|||
+|:------:|:-------:|
+|**Task**|**Value**|
+|*ARC*|0.6869|
+|*HellaSwag*|0.8642|
+|*MMLU*|0.6992|
+|*TruthfulQA*|0.5885|
+|*Winogrande*|0.8208|
+|*GSM8k*|0.4481|
+|*DROP*|0.5510|
+|**Total Average**|**0.6655**|
 
 
 <br>
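
As a quick sanity check on the updated table, the **Total Average** row appears to be the unweighted mean of the seven listed task scores. The sketch below assumes exactly that; the names `scores` and `mean_score` are illustrative and do not come from the repository.

```python
# Task scores copied from the updated README table in this commit.
scores = {
    "ARC": 0.6869,
    "HellaSwag": 0.8642,
    "MMLU": 0.6992,
    "TruthfulQA": 0.5885,
    "Winogrande": 0.8208,
    "GSM8k": 0.4481,
    "DROP": 0.5510,
}


def mean_score(values):
    """Return the unweighted arithmetic mean of a sequence of scores."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: the "Total Average" row is a plain mean over all seven tasks.
total_average = mean_score(scores.values())
print(f"{total_average:.4f}")  # prints 0.6655
```

Under that assumption the computed value matches the 0.6655 reported in the new table.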