dfurman committed fdac9cc (parent: 5d17f6b) · Update README.md

Files changed (1): README.md (+14 −0)
  This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-13b](https://huggingface.co/meta-llama/Llama-2-13b-hf) on the first 100k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single A6000 (48 GB) for roughly 18 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.
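
To make the QLoRA setup above concrete, here is a minimal, illustrative sketch using Hugging Face `transformers`, `peft`, and `bitsandbytes`. It is not the exact training recipe (see the finetuning script linked under Helpful Links below); the LoRA rank, alpha, dropout, and target modules shown are placeholder assumptions.

```python
# Illustrative QLoRA setup -- not the exact training script linked below.
# Assumptions: LoRA rank/alpha/dropout/target_modules are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model_id = "meta-llama/Llama-2-13b-hf"

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model and attach small trainable LoRA adapters
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,                      # assumed rank; see the linked script for actual values
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```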

### Benchmark metrics

| Metric              | Value |
|---------------------|-------|
| MMLU (5-shot)       | 55.76 |
| ARC (25-shot)       | 59.98 |
| HellaSwag (10-shot) | 82.43 |
| TruthfulQA (0-shot) | 42.59 |
| Avg.                | 60.19 |

We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
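
As a rough guide to reproducing these numbers, the sketch below uses the harness's Python entry point. It assumes a recent harness release (the `hf` model type string, the `peft=` model argument, and task names such as `arc_challenge` vary across versions, and the leaderboard pins its own version), and `<adapter-repo-id>` is a placeholder for this model's Hugging Face repo id.

```python
# Sketch: running the benchmark suite with the EleutherAI evaluation harness.
# Assumptions: task names and the "hf"/"peft=" options follow recent harness
# releases; the leaderboard pins a specific version, so exact names may differ.
from lm_eval import evaluator

# Each benchmark uses a different few-shot count, so evaluate the tasks separately.
task_settings = [
    ("mmlu", 5),
    ("arc_challenge", 25),
    ("hellaswag", 10),
    ("truthfulqa_mc2", 0),
]

for task, n_shot in task_settings:
    # Passing the model as a string reloads it per call; fine for a sketch.
    results = evaluator.simple_evaluate(
        model="hf",
        model_args="pretrained=meta-llama/Llama-2-13b-hf,peft=<adapter-repo-id>",
        tasks=[task],
        num_fewshot=n_shot,
        batch_size=4,
    )
    print(task, results["results"])
```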

### Helpful Links

* Model license: Llama 2 Community License Agreement
* Basic usage: [notebook](assets/basic_inference_llama_2_13b_dolphin.ipynb) (a minimal inference sketch also follows below)
* Finetuning script: [script](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/sft-llama-2-13b-dolphin-peft.py)
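
As referenced above, here is a minimal inference sketch that loads the base model and attaches the PEFT adapter. The `<adapter-repo-id>`, prompt, and generation settings are placeholders; the linked notebook shows the exact intended usage.

```python
# Minimal inference sketch (see the basic-usage notebook above for exact usage).
# Assumptions: <adapter-repo-id> is a placeholder for this model's repo id, and
# the prompt and generation settings below are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "meta-llama/Llama-2-13b-hf"
adapter_id = "<adapter-repo-id>"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the QLoRA adapter weights on top of the frozen base model
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

prompt = "Write a short note explaining what QLoRA finetuning is."
inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs, max_new_tokens=200, do_sample=True, temperature=0.7
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```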