---
license: apache-2.0
base_model:
- THUDM/CogView4-6B
base_model_relation: quantized
tags:
- quanto
---
## Quantization settings

- `vae.`: `torch.bfloat16`. No quantization.
- `text_encoder.layers.`:
  - Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
  - Target layers: `["q_proj", "k_proj", "v_proj", "o_proj", "mlp.down_proj", "mlp.gate_up_proj"]`
- `diffusion_model.`:
  - Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
  - Target layers: `["to_q", "to_k", "to_v", "to_out.0", "ff.net.0.proj", "ff.net.2"]`
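
The settings above can be sketched in code. This is a minimal illustration and not the script used to produce this checkpoint: it assumes the target layer names are matched as suffixes of module names, and it relies on the `include` argument of Optimum Quanto's `quantize`, which takes `fnmatch`-style patterns. The helper names and the example module paths are hypothetical.

```python
from fnmatch import fnmatch

# Target layer names copied from the settings listed above.
TEXT_ENCODER_TARGETS = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "mlp.down_proj", "mlp.gate_up_proj",
]
DENOISER_TARGETS = [
    "to_q", "to_k", "to_v", "to_out.0",
    "ff.net.0.proj", "ff.net.2",
]


def to_include_patterns(targets):
    """Turn layer-name suffixes into fnmatch patterns, e.g. "q_proj" -> "*.q_proj"."""
    return [f"*.{name}" for name in targets]


def is_targeted(module_name, targets):
    """Return True if a module name would be selected for quantization."""
    return any(fnmatch(module_name, pattern) for pattern in to_include_patterns(targets))


def quantize_int8(model, targets):
    """Quantize only the targeted submodules of `model` to int8 weights.

    Assumption: suffix matching is how the targets are meant to be applied.
    optimum-quanto is imported lazily so the helpers above work without it.
    """
    from optimum.quanto import freeze, qint8, quantize

    quantize(model, weights=qint8, include=to_include_patterns(targets))
    freeze(model)  # replace the float weights with their quantized form
    return model
```

Under this sketch, a module named `layers.0.self_attn.q_proj` (a hypothetical path) would be quantized, while normalization layers and any other module whose name does not end in a target suffix would keep their original dtype.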

## VRAM consumption

- Text encoder (`text_encoder.`): about 11 GB
- Denoiser (`diffusion_model.`): about 10 GB

## Samples

| `torch.bfloat16` | Quanto Int8 |
| - | - |
| <img src="./images/sample_bf16_01.jpg" width="320px" /> | <img src="./images/sample_quanto_01.jpg" width="320px" /> |
| VRAM 40 GB (without offloading) | VRAM 28 GB (without offloading) |

<details><summary>Generation parameters</summary>

- prompt: `""" A photo of a nendoroid figure of hatsune miku holding a sign that says "CogView4" """`
- negative_prompt: `"blurry, low quality, horror"`
- height: `1152`
- width: `1152`
- cfg_scale: `3.5`
- num_inference_steps: `20`

</details>
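
For readers who want to try the generation parameters listed above, the following is a rough sketch using the `CogView4Pipeline` class from `diffusers` with the unquantized base model `THUDM/CogView4-6B` in `torch.bfloat16`. It does not load the Int8 weights from this repository, and it assumes that `cfg_scale` corresponds to the `guidance_scale` argument in `diffusers`. The function name and output path are hypothetical. No seed is listed in the parameters, so the result will not match the sample image exactly.

```python
# Parameters copied from the generation parameters of this card.
# Assumption: `cfg_scale` maps to the diffusers argument `guidance_scale`.
GENERATION_KWARGS = {
    "prompt": 'A photo of a nendoroid figure of hatsune miku holding a sign that says "CogView4"',
    "negative_prompt": "blurry, low quality, horror",
    "height": 1152,
    "width": 1152,
    "guidance_scale": 3.5,
    "num_inference_steps": 20,
}


def generate_bf16_reference(output_path="sample_bf16.jpg", device="cuda"):
    """Render one image with the unquantized base model in bfloat16."""
    # Imported lazily so the parameter dict can be inspected without a GPU stack.
    import torch
    from diffusers import CogView4Pipeline

    pipe = CogView4Pipeline.from_pretrained(
        "THUDM/CogView4-6B", torch_dtype=torch.bfloat16
    )
    pipe.to(device)  # keeps every component on the GPU, i.e. no offloading

    image = pipe(**GENERATION_KWARGS).images[0]
    image.save(output_path)
    return image
```

Calling `generate_bf16_reference()` downloads the base model on first use and needs a GPU with enough memory to hold all components at once.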