namespace-Pt committed (verified)
Commit b4e2b3e · 1 Parent(s): d9b60fd

Upload folder using huggingface_hub

Files changed (1): README.md (+2, -2)
README.md CHANGED

@@ -6,7 +6,7 @@ pipeline_tag: text-generation
  <div align="center">
  <h1>Llama-3-8B-Instruct-80K-QLoRA</h1>
 
- <a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/">[Data&Code]</a>
+ <a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/longllm_qlora">[Data&Code]</a>
  </div>
 
  We extend the context length of Llama-3-8B-Instruct to 80K using QLoRA and 3.5K long-context training samples synthesized with GPT-4. The entire training cycle is highly efficient, taking only 8 hours on an 8xA800 (80G) machine. Yet, the resulting model achieves remarkable performance on a series of downstream long-context evaluation benchmarks.
@@ -14,7 +14,7 @@ We extend the context length of Llama-3-8B-Instruct to 80K using QLoRA and 3.5K
 
  # Evaluation
 
- All the following evaluation results can be reproduced by following the instructions [here](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon/new/docs/llama3-8b-instruct-qlora-80k.md).
+ All the following evaluation results can be reproduced by following the instructions [here](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/longllm_qlora).
 
  ## Needle in a Haystack
  We evaluate the model on the Needle-In-A-Haystack task using the official setting. The blue vertical line indicates the training context length, i.e. 80K.
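For readers who want to try the model described in this README, below is a minimal usage sketch; it is not part of the commit. It assumes the merged weights are hosted under the repo id `namespace-Pt/Llama-3-8B-Instruct-80K-QLoRA` (inferred from this model page) and that the standard `transformers` chat-template API applies; check the model card for the exact repo id and any recommended RoPE or attention settings for 80K-token contexts.

```python
# Hypothetical usage sketch (not from the commit): load the 80K-context model
# and run a short chat-style generation with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, inferred from this model page; verify on the model card.
model_id = "namespace-Pt/Llama-3-8B-Instruct-80K-QLoRA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to keep long contexts in memory
    device_map="auto",
)

# A long document would normally be pasted into the user message here.
messages = [{"role": "user", "content": "Summarize the following document:\n..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```

The `bfloat16` dtype and `device_map="auto"` are chosen only to keep the sketch runnable on a single large GPU; prompts approaching the 80K training context will need substantially more memory or an offloading strategy.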