---
datasets:
- jondurbin/airoboros-gpt4-1.4.1
---

# RoPE Scaled QLoRA Finetune of airoboros-33b-gpt4-1.4.1 (LoRA)

Full model card and GPTQ 4bit quantized weights can be found here: https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ

## Overview

This is [Jon Durbin's Airoboros 33B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-33b-gpt4-1.4) (LoRA weights) with several key modifications:
- Context length extended to 8192 by RoPE Scaled Embeddings (see the sketch after this list), but NOT via the superHOT LoRA. I started with base Llama-33b.
- Training sequences beyond 2048 have the target truncated to equal 2048.
- Used the airoboros-gpt4-1.4.1 dataset instead of airoboros-gpt4-1.4.
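
In case the RoPE scaling point is unfamiliar: the idea is to interpolate position indices so that 8192 positions are compressed into the range the base model was pretrained on (a scale factor of 2048/8192 = 0.25). The snippet below is a minimal, self-contained PyTorch sketch of that idea; it is not the exact patch used for this finetune, and all function names in it are illustrative.

```python
import torch

def scaled_rope_tables(dim: int, max_len: int = 8192,
                       base: float = 10000.0,
                       scale: float = 2048 / 8192):
    """Precompute RoPE cos/sin tables with linearly interpolated positions.

    scale < 1 compresses position ids so that 8192 positions cover the same
    rotation range the base Llama model saw for its native 2048 context.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    positions = torch.arange(max_len).float() * scale    # interpolated positions
    freqs = torch.outer(positions, inv_freq)              # (max_len, dim // 2)
    emb = torch.cat((freqs, freqs), dim=-1)                # (max_len, dim)
    return emb.cos(), emb.sin()

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(q: torch.Tensor, k: torch.Tensor,
               cos: torch.Tensor, sin: torch.Tensor):
    """Rotate query/key tensors of shape (..., seq_len, dim) using the scaled tables."""
    seq_len = q.shape[-2]
    cos, sin = cos[:seq_len], sin[:seq_len]
    return q * cos + rotate_half(q) * sin, k * cos + rotate_half(k) * sin
```

In practice this kind of scaling is typically applied by patching the model's rotary embedding module before the pretrained weights are loaded, so no architecture changes are needed.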

Otherwise, I emulated the training process as closely as possible (rank 64 QLoRA). It was trained on 1x RTX 6000 Ada for ~43 hours.
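
For context on what "rank 64 QLoRA" means in practice: the frozen base model is loaded in 4-bit (NF4) and rank-64 LoRA adapters are trained on top of it. Below is a hedged configuration sketch using Hugging Face `transformers` and `peft`; the alpha, dropout, and target-module choices are illustrative placeholders, not confirmed values from this run.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base weights -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Rank-64 LoRA adapters; alpha, dropout, and target_modules are placeholders.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```

These configs would then be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and `peft.get_peft_model(model, lora_config)` respectively before training.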