---
license: apache-2.0
base_model:
- THUDM/CogView4-6B
base_model_relation: quantized
tags:
- quanto
---

## Quantization settings

- `vae.`: `torch.bfloat16`. No quantization.
- `text_encoder.layers.`:
  - Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
  - Target layers: `["q_proj", "k_proj", "v_proj", "o_proj", "mlp.down_proj", "mlp.gate_up_proj"]`
- `diffusion_model.`:
  - Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
  - Target layers: `["to_q", "to_k", "to_v", "to_out.0", "ff.net.0.proj", "ff.net.2"]`
## VRAM consumption

- Text encoder (`text_encoder.`): about 11 GB
- Denoiser (`diffusion_model.`): about 10 GB
- VAE (`vae.`): about 1.5 GB
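For intuition on where such numbers come from: weight-only int8 stores one byte per quantized parameter versus two bytes in `torch.bfloat16`, so a quantized module needs roughly half the weight memory (the parameter count below is illustrative, not taken from the model card, and actual VRAM also includes activations and unquantized layers):

```python
def weight_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory footprint of the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# Illustrative 9-billion-parameter module:
print(round(weight_gib(9e9, 1), 1))  # int8 -> 8.4
print(round(weight_gib(9e9, 2), 1))  # bf16 -> 16.8
```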