turboderp/Phi-4-mini-instruct-exl3


turboderp committed 24 days ago

Commit f7887d4 · verified · 1 Parent(s): 5319c59

Create README.md

Files changed (1)

README.md +13 -0

README.md ADDED

	@@ -0,0 +1,13 @@

+EXL3 quants of [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct)
+At the moment these are all converted with 8-bpw output layers. The small but noticeable drop in accuracy at 6-bpw is still under investigation; it likely has to do with the tied embeddings.
+[2.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/2.0bpw)
+[2.25 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/2.25bpw)
+[2.50 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/2.5bpw)
+[3.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/3.0bpw)
+[3.50 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/3.5bpw)
+[4.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/4.0bpw)
+[5.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/5.0bpw)
+[6.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/6.0bpw)
+[8.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/8.0bpw)
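
Each link in the README points to a branch of the same repository named after its bitrate, so a single quant can be fetched by passing that branch as the revision. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper names `branch_for` and `download_quant` and the default `local_dir` are hypothetical and not part of the model card.

```python
# Repository and bitrates copied from the links in the README.
REPO_ID = "turboderp/Phi-4-mini-instruct-exl3"
AVAILABLE_BPW = (2.0, 2.25, 2.5, 3.0, 3.5, 4.0, 5.0, 6.0, 8.0)


def branch_for(bpw: float) -> str:
    """Return the branch (revision) name for a listed bitrate, e.g. 4.0 -> '4.0bpw'."""
    bpw = float(bpw)
    if bpw not in AVAILABLE_BPW:
        raise ValueError(f"no {bpw} bpw quant listed; choose from {AVAILABLE_BPW}")
    return f"{bpw}bpw"


def download_quant(bpw: float, local_dir: str = "Phi-4-mini-instruct-exl3") -> str:
    """Download one quant branch and return its local path (needs network access)."""
    # Imported lazily so branch_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch_for(bpw),  # the branch name doubles as the revision
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(branch_for(4.0))
```

Calling `download_quant(4.0)` would fetch only the `4.0bpw` branch rather than every quant in the repository.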