bhenrym14 committed
Commit 9cf7705 · 1 Parent(s): d9d59af

Update README.md

Files changed (1)
  1. README.md +6 -9
README.md CHANGED
@@ -2,18 +2,15 @@
 datasets:
 - jondurbin/airoboros-gpt4-1.4.1
 ---
-RoPE Scaled QLoRA Finetune of airoboros-33b-gpt4-1.4.1 (LoRA)
+# RoPE Scaled QLoRA Finetune of airoboros-33b-gpt4-1.4.1 (LoRA)
 
 Full model card and GPTQ 4bit quantized weights can be found here: https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ
 
 
-
-Overview
-
-This is [Jon Durbin's Airoboros 33B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-33b-gpt4-1.4) (LoRA weights), trained with several key modifications:
-
-Context length extended to 8192 by RoPE Scaled Embeddings, but NOT via the superHOT LoRA. I started with base Llama-33b.
-Training sequences beyond 2048 have the target truncated to equal 2048.
-Used airoboros-gpt4-1.4.1 dataset instead of airoboros-gpt4-1.4
+## Overview
+
+This is [Jon Durbin's Airoboros 33B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-33b-gpt4-1.4) (LoRA weights) with several key modifications:
+- Context length extended to 8192 via RoPE scaled embeddings, NOT via the SuperHOT LoRA. I started with base Llama-33b.
+- Training sequences beyond 2048 tokens have the target truncated to 2048 tokens.
+- Used the airoboros-gpt4-1.4.1 dataset instead of airoboros-gpt4-1.4.
 
 Otherwise, I emulated the training process as closely as possible (rank 64 QLoRA). It was trained on 1x RTX 6000 Ada for ~43 hours.
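The "RoPE scaled embeddings" referenced in the README correspond to linear position interpolation: rotary position indices are divided by a fixed factor (here 8192 / 2048 = 4) so that the extended context maps back onto the position range the base model was pretrained on. The snippet below is a minimal sketch of that idea in PyTorch, not the exact patch used for this finetune; the function name and the `head_dim` default are illustrative assumptions.

```python
import torch

def scaled_rope_cos_sin(seq_len: int, head_dim: int, base: float = 10000.0,
                        scale: float = 8192 / 2048):
    """Rotary embedding angles with linearly interpolated positions.

    Positions are divided by `scale`, squeezing 8192 positions into the
    0..2048 range seen during Llama pretraining (illustrative sketch only).
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    t = torch.arange(seq_len).float() / scale      # interpolated positions
    freqs = torch.outer(t, inv_freq)               # (seq_len, head_dim / 2)
    emb = torch.cat((freqs, freqs), dim=-1)        # (seq_len, head_dim)
    return emb.cos(), emb.sin()

# head_dim=128 matches Llama-33b's per-head dimension.
cos, sin = scaled_rope_cos_sin(seq_len=8192, head_dim=128)
print(cos.shape)  # torch.Size([8192, 128])
```

Newer transformers releases expose an equivalent option on LlamaConfig (`rope_scaling={"type": "linear", "factor": 4.0}`), which may avoid the need for a manual patch.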
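The "rank 64 QLoRA" setup mentioned at the end of the README amounts to loading the base model in 4-bit NF4 and attaching rank-64 LoRA adapters. Below is a hedged sketch using peft and bitsandbytes; the base-model id, lora_alpha, dropout, and target_modules are illustrative assumptions, not confirmed hyperparameters of this run.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantized base model, as in standard QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-30b",   # placeholder id standing in for base Llama-33b
    quantization_config=bnb_config,
    device_map="auto",
)

# Rank-64 LoRA adapters; the target modules follow common QLoRA practice
# and may differ from the exact set used for this finetune.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,            # assumed value, not stated in the README
    lora_dropout=0.05,        # assumed value, not stated in the README
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the rank-64 adapter weights train
```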