Commit f7887d4 by turboderp (verified; parent: 5319c59): Create README.md
EXL3 quants of [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct)

At the moment these are all converted with 8-bpw output layers. I am currently investigating why there is a small but noticeable drop in accuracy at 6 bpw; it likely has to do with the tied embeddings.

- [2.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/2.0bpw)
- [2.25 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/2.25bpw)
- [2.50 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/2.5bpw)
- [3.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/3.0bpw)
- [3.50 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/3.5bpw)
- [4.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/4.0bpw)
- [5.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/5.0bpw)
- [6.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/6.0bpw)
- [8.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/8.0bpw)
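Each quant lives on its own branch of the repo, so a specific variant can be fetched by passing the branch name as the `revision` to `huggingface_hub`. A minimal sketch (the `branch_for` helper is illustrative, not part of the repo; it just encodes the branch-naming pattern listed above):

```python
def branch_for(bpw: float) -> str:
    """Map a bits-per-weight value to its branch name, e.g. 2.25 -> "2.25bpw"."""
    s = f"{bpw:.2f}"      # "2.00", "2.25", "2.50", ...
    if s.endswith("0"):
        s = s[:-1]        # keep at least one decimal: "2.0", "2.25", "2.5"
    return f"{s}bpw"

if __name__ == "__main__":
    # Third-party dependency: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    # Download only the 4.0bpw variant (each branch holds one quant).
    local_dir = snapshot_download(
        repo_id="turboderp/Phi-4-mini-instruct-exl3",
        revision=branch_for(4.0),
    )
    print(local_dir)
```

With `exllamav3` installed, the downloaded directory can then be loaded like any other EXL3 model folder.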