Update README.md
README.md CHANGED
@@ -1,3 +1,23 @@
----
-license: apache-2.0
----
+---
+license: apache-2.0
+---
+
+* Quantization of Qwen2.5 14B for edge devices: 7.3 GB footprint.
+* One of the best models I have tried in Spanish.
+* Original model: https://huggingface.co/djuna/Q2.5-Veltha-14B-0.5
+* Merged models:
+  * huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
+  * allura-org/TQ2.5-14B-Aletheia-v1
+  * EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
+  * v000000/Qwen2.5-Lumen-14B
+
+* All quants were made using the imatrix option, with the calibration dataset from here (example commands below).
+* Using llama.cpp compiled with CUDA support for quantization and inference:
+
+```
+ggml_cuda_init: found 2 CUDA devices:
+  Device 0: NVIDIA GeForce RTX 4060 Ti, compute capability 8.9, VMM: yes
+  Device 1: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
+version: 3982 (cc2983d3)
+built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
+```
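
A minimal sketch of how imatrix quants like these are typically produced with llama.cpp; the file names, calibration text, and quant type below are placeholders, not necessarily the exact ones used for this repo:

```sh
# Build llama.cpp with CUDA support.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

# 1. Compute an importance matrix from a calibration text file
#    (calibration.txt stands in for the dataset linked in the README).
./build/bin/llama-imatrix -m Q2.5-Veltha-14B-0.5-F16.gguf \
    -f calibration.txt -o imatrix.dat -ngl 99

# 2. Quantize, weighting the tensors with the importance matrix.
#    Q4_K_M is only an example quant type.
./build/bin/llama-quantize --imatrix imatrix.dat \
    Q2.5-Veltha-14B-0.5-F16.gguf Q2.5-Veltha-14B-0.5-Q4_K_M.gguf Q4_K_M
```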
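Inference with a resulting GGUF can use the same CUDA build; the model file name and prompt are illustrative, and `-ngl 99` offloads all layers to the GPUs:

```sh
# Run an interactive generation, splitting layers across the available CUDA devices.
./build/bin/llama-cli -m Q2.5-Veltha-14B-0.5-Q4_K_M.gguf \
    -ngl 99 -c 4096 -p "Hola, ¿cómo estás?"
```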