Update README.md
README.md CHANGED

@@ -5,14 +5,11 @@ license: apache-2.0
 * Quantization of Qwen2.5 14B for edge devices, 7.3 GB footprint
 * One of the best models I have tried in Spanish.
 * Original model: https://huggingface.co/djuna/Q2.5-Veltha-14B-0.5
-* Models Merged
-*
-
-*
-*
-* EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
-* v000000/Qwen2.5-Lumen-14B
-`
+* Models Merged:
+  * huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
+  * allura-org/TQ2.5-14B-Aletheia-v1
+  * EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
+  * v000000/Qwen2.5-Lumen-14B
 
 * All quants made using imatrix option with dataset from here
 * Using llama.cpp compiled with CUDA support for quantization and inference:
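The imatrix quantization workflow the README refers to can be sketched roughly as follows. This is an illustrative sketch, not the author's exact commands: the binary names match recent llama.cpp builds (`llama-imatrix`, `llama-quantize`; older builds named them `imatrix` and `quantize`), and the model path, calibration file, and quant type are placeholder assumptions.

```shell
# Sketch of an imatrix-guided quantization with llama.cpp.
# All file names below are illustrative placeholders.
MODEL_F16=Q2.5-Veltha-14B-0.5-F16.gguf   # full-precision GGUF conversion of the original model
CALIB=calibration.txt                    # text dataset used to compute the importance matrix

# 1. Compute an importance matrix over the calibration dataset:
llama-imatrix -m "$MODEL_F16" -f "$CALIB" -o imatrix.dat

# 2. Quantize, letting the importance matrix guide which weights keep more precision
#    (Q4_K_M shown as an example quant type):
llama-quantize --imatrix imatrix.dat "$MODEL_F16" Q2.5-Veltha-14B-0.5-Q4_K_M.gguf Q4_K_M
```

The importance matrix records which weights most affect outputs on the calibration text, so low-bit quants lose less quality than a plain (non-imatrix) quantization of the same type.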