Update README.md
README.md CHANGED
@@ -1,12 +1,13 @@
 ---
 license: apache-2.0
 ---
+Coming soon! Just learned about TheBloke's quant issues, will update later.
+
+
+
 This is a llamafile for [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
 
-I'm adding both q4-k-m and q5-k-m this time since it's a big model. On my 4090, q4-k-m is twice as fast as q5-k-m, with no noticeable difference in chat or information quality. q5-k-m is unusably slow on my desktop, so q4-k-m is recommended.
 
-The quantized gguf was downloaded straight from [TheBloke](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF) this time,
-and then zipped into a llamafile using [Mozilla's awesome project](https://github.com/Mozilla-Ocho/llamafile).
 
 It's over 4GB, so if you want to use it on Windows you'll have to run it from WSL.
 
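The "zipped into a llamafile" step mentioned in the removed lines can be sketched roughly as below, following the llamafile project's documented `zipalign` workflow. The file names (`mixtral.Q4_K_M.gguf`, the output name) are illustrative assumptions, not taken from this repo:

```shell
# Sketch of packaging a GGUF into a llamafile, per Mozilla's llamafile docs.
# File names here are placeholders, not from this repo.

# 1. Start from a copy of the stock llamafile runtime binary.
cp llamafile mixtral-8x7b-instruct.llamafile

# 2. Optionally bake default CLI arguments into the archive.
cat > .args <<'EOF'
-m
mixtral.Q4_K_M.gguf
EOF

# 3. Append the model weights (and .args) as stored zip members.
zipalign -j0 mixtral-8x7b-instruct.llamafile mixtral.Q4_K_M.gguf .args

# 4. Make it executable and run it (on Windows, do this from WSL since
#    the result is well over the 4GB executable limit).
chmod +x mixtral-8x7b-instruct.llamafile
./mixtral-8x7b-instruct.llamafile
```

The `-j0` flag tells `zipalign` to store the weights uncompressed and page-aligned so the runtime can map them directly.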