p1atdev committed on
Commit
7bcbaa9
·
verified ·
1 Parent(s): ddd497a

Update README.md

Files changed (1)
  1. README.md +23 -3
README.md CHANGED
@@ -1,3 +1,23 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ base_model:
+ - THUDM/CogView4-6B
+ base_model_relation: quantized
+ tags:
+ - quanto
+ ---
+ ## Quantization settings
+
+ - `vae.`: `torch.bfloat16`. No quantization.
+ - `text_encoder.layers.`:
+   - Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
+   - Target layers: `["q_proj", "k_proj", "v_proj", "o_proj", "mlp.down_proj", "mlp.gate_up_proj"]`
+ - `diffusion_model.`:
+   - Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
+   - Target layers: `["to_q", "to_k", "to_v", "to_out.0", "ff.net.0.proj", "ff.net.2"]`
+
+ ## VRAM consumption
+
+ - Text encoder (`text_encoder.`): about 11 GB
+ - Denoiser (`diffusion_model.`): about 10 GB
+ - VAE (`vae.`): about 1.5 GB
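
The quantization settings in this README list target layer names but not the script that produced the checkpoint. The following is a minimal sketch of how such a selection could be expressed with Optimum Quanto, not the author's actual procedure. It assumes that the target names are matched as suffixes of module names, which is why each one is turned into a `*name` glob pattern for the `include` argument of `quantize`. The helper names (`to_include_patterns`, `select_modules`, `quantize_int8`) and the example module names are hypothetical.

```python
from fnmatch import fnmatch

# Target layer names copied from the README.
TEXT_ENCODER_TARGETS = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "mlp.down_proj", "mlp.gate_up_proj",
]
DENOISER_TARGETS = [
    "to_q", "to_k", "to_v", "to_out.0",
    "ff.net.0.proj", "ff.net.2",
]


def to_include_patterns(targets):
    """Turn layer-name suffixes into glob patterns (assumption: suffix match)."""
    return [f"*{name}" for name in targets]


def select_modules(module_names, targets):
    """Return the module names that at least one target pattern matches."""
    patterns = to_include_patterns(targets)
    return [
        name for name in module_names
        if any(fnmatch(name, pattern) for pattern in patterns)
    ]


def quantize_int8(model, targets):
    """Quantize the weights of the matching submodules to int8, in place.

    The import is deferred so the selection helpers above can be tried
    without torch or optimum-quanto installed.
    """
    from optimum.quanto import freeze, qint8, quantize

    quantize(model, weights=qint8, include=to_include_patterns(targets))
    freeze(model)  # replace float weights with their quantized form
    return model


if __name__ == "__main__":
    # Hypothetical module names, for illustration only.
    names = [
        "layers.0.self_attn.q_proj",
        "layers.0.input_layernorm",
        "layers.0.mlp.gate_up_proj",
    ]
    print(select_modules(names, TEXT_ENCODER_TARGETS))
```

Running `select_modules` over the names from `model.named_modules()` before quantizing is a cheap way to confirm that the patterns hit only the intended linear layers and leave normalization layers untouched.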