|
---
license: apache-2.0
---
|
|
|
This is a llamafile for [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1). |
|
|
|
These were converted and quantized from source safetensors using llama.cpp on April 3, 2024. |
|
This matters because several GGUF files on HF were created before llama.cpp's support for MoE quantization was fully debugged, even though it appeared to be producing working files at the time.
|
|
|
I'll be uploading the quantized .gguf sources I created as well, in case anyone wants them as a reference or for further work.
|
|
|
|
|
-= Llamafile =- |
|
|
|
A llamafile is a standalone executable that runs an LLM server locally on a variety of operating systems, including FreeBSD, Windows, Windows via WSL, Linux, and Mac.

The same file works everywhere; I've tested several of these on FreeBSD, Windows, Windows via WSL, and Linux.
|
You just download the .llamafile, make it executable (`chmod +x`, or rename it to .exe as needed), run it, open the chat interface in a browser, and interact.

Options can be passed in to expose the API, etc. See their [docs](https://github.com/Mozilla-Ocho/llamafile) for details.
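As a sketch of those steps on Linux or a Mac (the filename below is a placeholder for whichever .llamafile you downloaded from this repo, and the flags shown are the ones llamafile inherits from the llama.cpp server; check their docs for what's current):

```shell
# Make the downloaded file executable, then run it
chmod +x mixtral-8x7b-instruct.llamafile   # placeholder filename
./mixtral-8x7b-instruct.llamafile          # then open the chat UI it serves in a browser

# Extra arguments are passed through to the embedded server,
# e.g. to expose the API on all interfaces (assumed flags):
./mixtral-8x7b-instruct.llamafile --host 0.0.0.0 --port 8080
```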
|
|
|
[Mozilla Blog Announcement for Llamafile](https://hacks.mozilla.org/2023/11/introducing-llamafile/) |
|
|
|
|
|
- Windows note: Windows can't execute files over 4 GB, so if the llamafile is larger than that you'll have to run it from WSL.
|
|
|
- WSL note: If you get the error about APE, and the recommended command |
|
|
|
`sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop'` |
|
|
|
doesn't work, the WSLInterop file might be named something else. I had success with |
|
|
|
`sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop-late'` |
|
|
|
If that fails too, just navigate to `/proc/sys/fs/binfmt_misc` and see which files look like `WSLInterop`,

then echo a -1 to whatever they're called by changing that part of the recommended command.
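The lookup-and-disable steps above can be sketched as follows (requires root; the entry names vary between WSL versions, which is the whole problem):

```shell
# List whatever WSLInterop registrations exist on this system
ls /proc/sys/fs/binfmt_misc/ | grep -i wslinterop

# Disable each one found (writing -1 unregisters a binfmt_misc entry)
for f in /proc/sys/fs/binfmt_misc/WSLInterop*; do
  [ -e "$f" ] && sudo sh -c "echo -1 > '$f'"
done
```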
|
|
|
|
|
- FreeBSD note: Yes, it actually works on a fresh install of FreeBSD. |