dfurman committed on
Commit 3eeff15 · 1 Parent(s): 256b1cd

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -25,13 +25,13 @@ This instruction model was built via parameter-efficient QLoRA finetuning of [Ll
 
 | Metric | Value |
 |-----------------------|-------|
-| MMLU (5-shot) | Coming |
-| ARC (25-shot) | Coming |
-| HellaSwag (10-shot) | Coming |
-| TruthfulQA (0-shot) | Coming |
-| Avg. | Coming |
+| MMLU (5-shot) | 46.63 |
+| ARC (25-shot) | 51.19 |
+| HellaSwag (10-shot) | 78.92 |
+| TruthfulQA (0-shot) | 48.5 |
+| Avg. | 56.31 |
 
-We use state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
+We use the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
 
 ## Helpful links
 
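
Reviewer note: the new `Avg.` row appears to be the unweighted mean of the four benchmark scores added in this commit. The README does not state how the average is computed, so the rounding to two decimals and the helper name `mean_score` below are assumptions made for illustration only. A minimal sketch to check the arithmetic:

```python
# Scores copied from the updated table in this commit.
scores = {
    "MMLU (5-shot)": 46.63,
    "ARC (25-shot)": 51.19,
    "HellaSwag (10-shot)": 78.92,
    "TruthfulQA (0-shot)": 48.5,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals (assumed convention)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


# Compare against the "Avg." value reported in the table.
print(mean_score(scores.values()))
```

Under that assumption the result agrees with the reported 56.31 (225.24 / 4).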