pszemraj committed on
Commit
b5b4be0
·
verified ·
1 Parent(s): 534089b

Update README.md

  1. README.md +13 -0
README.md CHANGED
@@ -23,6 +23,19 @@ cargo run --example quantized-t5 --release -- \
 
  On my laptop (CPU, running in WSL) I get: `45 tokens generated (0.48 token/s)`
 
+ ## weights
+
+
+ Below are the weights/file names in this repo:
+
+ | Weight File Name  | Quant Format | Size (GB) |
+ |-------------------|--------------|-----------|
+ | flan-ul2-q2k.gguf | q2k          | 6.39      |
+ | flan-ul2-q3k.gguf | q3k          | 8.36      |
+ | flan-ul2-q4k.gguf | q4k          | 10.9      |
+ | flan-ul2-q6k.gguf | q6k          | 16        |
+
+ From initial testing, `q2k` appears to be too low precision and produces poor or incoherent output. `q3k` and higher produce coherent output.
 
  ## setup
 
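
Review note: one way a reader might use the weights table added in this change is to pick the largest file that fits a given size budget while skipping `q2k`, which the README reports as incoherent. The sketch below is only an illustration and not part of the commit; `WeightFile`, `WEIGHTS`, and `pick_weight_file` are hypothetical names, and it assumes the file size in GB is a usable stand-in for the budget, although actual memory use at runtime may be higher.

```rust
/// One row of the README's weights table.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WeightFile {
    file_name: &'static str,
    quant: &'static str,
    size_gb: f64,
}

/// Rows copied from the table in this change, smallest first.
const WEIGHTS: [WeightFile; 4] = [
    WeightFile { file_name: "flan-ul2-q2k.gguf", quant: "q2k", size_gb: 6.39 },
    WeightFile { file_name: "flan-ul2-q3k.gguf", quant: "q3k", size_gb: 8.36 },
    WeightFile { file_name: "flan-ul2-q4k.gguf", quant: "q4k", size_gb: 10.9 },
    WeightFile { file_name: "flan-ul2-q6k.gguf", quant: "q6k", size_gb: 16.0 },
];

/// Hypothetical helper: returns the largest file whose size is at most
/// `budget_gb`, skipping q2k because the README reports poor output for it.
fn pick_weight_file(budget_gb: f64) -> Option<WeightFile> {
    WEIGHTS
        .iter()
        .copied()
        .filter(|w| w.quant != "q2k" && w.size_gb <= budget_gb)
        .max_by(|a, b| a.size_gb.total_cmp(&b.size_gb))
}

fn main() {
    // Example budgets in GB; these values are arbitrary.
    for budget in [8.0, 12.0, 32.0] {
        match pick_weight_file(budget) {
            Some(w) => println!("{budget} GB budget -> {} ({})", w.file_name, w.quant),
            None => println!("{budget} GB budget -> no coherent quant fits"),
        }
    }
}
```

Excluding `q2k` outright is a design choice that follows the README's initial testing; a reader who finds `q2k` acceptable for their task could drop that filter.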