kaiokendev committed on
Commit
b8e6174
·
1 Parent(s): 8e78ead

Update README.md

  1. README.md +8 -0
README.md CHANGED
@@ -7,8 +7,16 @@ license: mit
  This is a second prototype of SuperHOT, this time with 16K context and no RLHF, using the same technique described in [the github blog](https://kaiokendev.github.io/til#extending-context-to-8k).
  Tests have shown that the model does indeed leverage the extended context at 8K, so naturally, let's try going even further.
 
+ #### Looking for Merged & Quantized Models?
+ - 13B 16K GGML: [tmpupload/superhot-13b-16k-no-rlhf-test-GGML](https://huggingface.co/tmpupload/superhot-13b-16k-no-rlhf-test-GGML)
+ - 13B 16K CUDA (no groupsize): [tmpupload/superhot-13b-16k-no-rlhf-test-GPTQ](https://huggingface.co/tmpupload/superhot-13b-16k-no-rlhf-test-GPTQ)
+
+ #### Using the monkey-patch?
  You will need to **use either the monkeypatch** or, if you are already using the monkeypatch, **change the scaling factor to 0.125 and the maximum sequence length to 16384**
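
For readers unfamiliar with what the scaling factor does, here is a minimal sketch of position scaling in rotary embeddings. It is not the monkeypatch itself: the function name `rotary_angles`, the head dimension of 128, and the base of 10000 are assumptions chosen for illustration. Only the values 0.125 and 16384 come from the README.

```python
def rotary_angles(position, head_dim, scale=1.0, base=10000.0):
    """Return the rotary angles for a single token position.

    The position index is multiplied by `scale` before the angles are
    computed. This is an assumed, simplified form of the interpolation
    idea, not the actual monkeypatch code.
    """
    scaled_position = position * scale
    return [
        scaled_position * base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


SCALE = 0.125        # scaling factor from the README
MAX_SEQ_LEN = 16384  # maximum sequence length from the README

# The last position of a 16K sequence, after scaling.
last = rotary_angles(MAX_SEQ_LEN - 1, head_dim=128, scale=SCALE)
print(last[0])  # 2047.875
```

With a scale of 0.125, the last of the 16384 positions is treated as position 2047.875, so every position stays below 2048.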
 
+ #### Using Oobabooga or Exllama?
+ - `python server.py --max_seq_len 16384 --compress_pos_emb 8 --loader exllama_hf`
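
The value 8 passed to `--compress_pos_emb` appears to be the reciprocal of the monkeypatch scaling factor of 0.125. The sketch below shows that arithmetic, assuming a base context of 2048 tokens (LLaMA's original training length, which the README does not state explicitly). The helper `compression_factor` is hypothetical and is not part of Oobabooga or Exllama.

```python
BASE_CONTEXT = 2048     # assumed original context length of the base model
TARGET_CONTEXT = 16384  # --max_seq_len from the README


def compression_factor(target_len, base_len=BASE_CONTEXT):
    """Hypothetical helper: how many times the context is stretched."""
    if target_len % base_len != 0:
        raise ValueError("target length should be a multiple of the base length")
    return target_len // base_len


factor = compression_factor(TARGET_CONTEXT)  # value for --compress_pos_emb
scale = 1 / factor                           # monkeypatch scaling factor

# Rebuild the command line shown in the README from the derived values.
args = [
    "python", "server.py",
    "--max_seq_len", str(TARGET_CONTEXT),
    "--compress_pos_emb", str(factor),
    "--loader", "exllama_hf",
]
print(" ".join(args))
```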
+
  I trained the LoRA with the following configuration:
  - 1200 samples (~400 samples over 2048 sequence length)
  - learning rate of 3e-4