gobean committed
Commit 4a22db4 · verified · 1 Parent(s): c04a506

Update README.md

Files changed (1): README.md (+4 −3)
README.md CHANGED
@@ -1,12 +1,13 @@
  ---
  license: apache-2.0
  ---
+ Coming soon! just learned about thebloke's quant issues, will update later.
+
+
+
  This is a llamafile for [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
 
- I'm adding both q4-k-m and q5-k-m this time since it's a big model. On my 4090, q4-k-m is twice as fast as q5-k-m without noticeable difference in chat or information quality. The speed of q5-k-m on my desktop computer is unusable; q4-k-m recommended.
 
- The quantized gguf was downloaded straight from [TheBloke](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF) this time,
- and then zipped into a llamafile using [Mozilla's awesome project](https://github.com/Mozilla-Ocho/llamafile).
 
  It's over 4gb so if you want to use it on Windows you'll have to run it from WSL.
 