---
license: apache-2.0
base_model:
- THUDM/CogView4-6B
base_model_relation: quantized
tags:
- quanto
---
## Quantization settings
- `vae.`: kept in `torch.bfloat16`; not quantized.
- `text_encoder.layers.`:
- Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
- Target layers: `["q_proj", "k_proj", "v_proj", "o_proj", "mlp.down_proj", "mlp.gate_up_proj"]`
- `diffusion_model.`:
- Int8 with [Optimum Quanto](https://github.com/huggingface/optimum-quanto)
- Target layers: `["to_q", "to_k", "to_v", "to_out.0", "ff.net.0.proj", "ff.net.2"]`
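Below is a minimal sketch of how a comparable setup could be reproduced with Optimum Quanto and diffusers. It is not the exact export script for this checkpoint: it applies a blanket int8 weight quantization to the text encoder and the diffusion transformer rather than restricting it to the layer lists above, and it assumes a diffusers version that ships `CogView4Pipeline`.

```python
# Sketch (not the exact export script): int8-quantize the CogView4 text encoder
# and denoiser with Optimum Quanto, keeping the VAE in bfloat16.
# Assumes diffusers >= 0.33 (CogView4Pipeline) and optimum-quanto are installed.
import torch
from diffusers import CogView4Pipeline
from optimum.quanto import quantize, freeze, qint8

pipe = CogView4Pipeline.from_pretrained(
    "THUDM/CogView4-6B", torch_dtype=torch.bfloat16
)

# Int8 weights for the text encoder and the diffusion transformer.
# (This checkpoint only quantizes the attention/MLP layers listed above;
# a blanket quantize() call is shown here for brevity.)
quantize(pipe.text_encoder, weights=qint8)
freeze(pipe.text_encoder)

quantize(pipe.transformer, weights=qint8)
freeze(pipe.transformer)

# The VAE stays in torch.bfloat16, matching the settings above.
pipe.to("cuda")
```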
## VRAM consumption
- Text encoder (`text_encoder.`): about 11 GB
- Denoiser (`diffusion_model.`): about 10 GB
## Samples
|`torch.bfloat16` | Quanto Int8 |
| - | - |
| <img src="./images/sample_bf16_01.jpg" width="320px" /> | <img src="./images/sample_quanto_01.jpg" width="320px" /> |
| VRAM: 40 GB (without offloading) | VRAM: 28 GB (without offloading) |
<details><summary>Generation parameters</summary>

- prompt: `""" A photo of a nendoroid figure of hatsune miku holding a sign that says "CogView4" """`
- negative_prompt: `"blurry, low quality, horror"`
- height: `1152`
- width: `1152`
- cfg_scale: `3.5`
- num_inference_steps: `20`
</details>
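For reference, a minimal sketch of how the sample parameters above map onto a diffusers call, assuming `pipe` is the quantized `CogView4Pipeline` from the earlier snippet (the output filename is illustrative):

```python
# Sketch: reproduce the sample generation settings listed above.
image = pipe(
    prompt='A photo of a nendoroid figure of hatsune miku holding a sign that says "CogView4"',
    negative_prompt="blurry, low quality, horror",
    height=1152,
    width=1152,
    guidance_scale=3.5,        # cfg_scale above
    num_inference_steps=20,
).images[0]
image.save("sample_quanto_01.jpg")
```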