Update README.md
README.md CHANGED
@@ -1,12 +1,13 @@
 ---
 license: apache-2.0
 ---
+Coming soon! Just learned about TheBloke's quant issues, will update later.
+
+
+
 This is a llamafile for [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
 
-I'm adding both q4-k-m and q5-k-m this time since it's a big model. On my 4090, q4-k-m is twice as fast as q5-k-m, with no noticeable difference in chat or information quality. q5-k-m is unusably slow on my desktop, so q4-k-m is recommended.
 
-The quantized gguf was downloaded straight from [TheBloke](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF) this time,
-and then zipped into a llamafile using [Mozilla's awesome project](https://github.com/Mozilla-Ocho/llamafile).
 
 It's over 4GB, so if you want to use it on Windows you'll have to run it from WSL.
 
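The "zipped into a llamafile" step mentioned in the removed lines can be sketched roughly as below, following the llamafile project's documented `zipalign` workflow. The file names (`mixtral.Q4_K_M.gguf`, the output name) are illustrative assumptions, not taken from this repo:

```shell
# Sketch of packaging a GGUF into a llamafile, per Mozilla's llamafile docs.
# File names here are placeholders, not from this repo.

# 1. Start from a copy of the stock llamafile runtime binary.
cp llamafile mixtral-8x7b-instruct.llamafile

# 2. Optionally bake default CLI arguments into the archive.
cat > .args <<'EOF'
-m
mixtral.Q4_K_M.gguf
EOF

# 3. Append the model weights (and .args) as stored zip members.
zipalign -j0 mixtral-8x7b-instruct.llamafile mixtral.Q4_K_M.gguf .args

# 4. Make it executable and run it (on Windows, do this from WSL since
#    the result is well over the 4GB executable limit).
chmod +x mixtral-8x7b-instruct.llamafile
./mixtral-8x7b-instruct.llamafile
```

The `-j0` flag tells `zipalign` to store the weights uncompressed and page-aligned so the runtime can map them directly.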