Update README.md
README.md
CHANGED
@@ -1,118 +1,28 @@
 ---
 license: other
 license_name: flux-1-dev-non-commercial-license
 tags:
 - image-to-image
 - SVDQuant
 - FLUX.1
 - Diffusion
 - Quantization
-- ControlNet
-- depth-to-image
-- image-generation
-- text-to-image
 - ICLR2025
-- FLUX.1-Canny-dev
-language:
-- en
-base_model:
-- black-forest-labs/FLUX.1-Canny-dev
-base_model_relation: quantized
-pipeline_tag: image-to-image
-datasets:
-- mit-han-lab/svdquant-datasets
-library_name: diffusers
----
-
-<p align="center" style="border-radius: 10px">
-  <img src="https://github.com/mit-han-lab/nunchaku/raw/refs/heads/main/assets/logo.svg" width="50%" alt="logo"/>
-</p>
-<h4 style="display: flex; justify-content: center; align-items: center; text-align: center;">Quantization Library: <a href='https://github.com/mit-han-lab/deepcompressor'>DeepCompressor</a>   Inference Engine: <a href='https://github.com/mit-han-lab/nunchaku'>Nunchaku</a>
-</h4>
-
-<div style="display: flex; justify-content: center; align-items: center; text-align: center;">
-<a href="https://arxiv.org/abs/2411.05007">[Paper]</a>
-<a href='https://github.com/mit-han-lab/nunchaku'>[Code]</a>
-<a href='https://svdquant.mit.edu'>[Demo]</a>
-<a href='https://hanlab.mit.edu/projects/svdquant'>[Website]</a>
-<a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
-</div>
-
-`svdq-int4-flux.1-canny-dev` is an INT4-quantized version of [`FLUX.1-Canny-dev`](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev), which generates an image from a text description while following the Canny edges of a given input image. It offers approximately 4× memory savings and runs 2–3× faster than the original BF16 model.
-
-## Method
-
-#### Quantization Method -- SVDQuant
-
-Overview of SVDQuant. Stage 1: Originally, both the activation ***X*** and the weights ***W*** contain outliers, making 4-bit quantization challenging. Stage 2: We migrate the outliers from the activation to the weights, resulting in an updated activation and updated weights. While the activation becomes easier to quantize, the weights now become more difficult. Stage 3: SVDQuant further decomposes the weights into a low-rank component and a residual with SVD. Thus, the quantization difficulty is alleviated by the low-rank branch, which runs at 16-bit precision.
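To make Stage 3 concrete, here is a minimal NumPy sketch of the low-rank-plus-residual idea. It is an illustration only, not the DeepCompressor implementation; the rank of 16, the toy outlier column, and the naive per-tensor INT4 quantizer are all assumptions:

```python
import numpy as np

def fake_int4(x):
    # Naive symmetric per-tensor 4-bit quantize/dequantize round trip,
    # standing in for the real SVDQuant quantizer.
    scale = np.abs(x).max() / 7.0
    return np.clip(np.round(x / scale), -8, 7) * scale

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
W[:, 0] *= 50.0  # an outlier column (as after Stage 2) that would dominate the INT4 scale

# Stage 3: W = L + R, where L is a 16-bit low-rank branch and R is the residual
# that actually gets quantized to 4 bits.
rank = 16
U, S, Vt = np.linalg.svd(W, full_matrices=False)
L = (U[:, :rank] * S[:rank]) @ Vt[:rank]  # low-rank branch, kept at 16-bit precision
R = W - L                                 # residual with a much smaller dynamic range

print("direct INT4 error:     ", np.abs(W - fake_int4(W)).mean())
print("low-rank + INT4 error: ", np.abs(W - (L + fake_int4(R))).mean())
```

Because the dominant singular directions (including the migrated outliers) stay in the 16-bit branch, the residual quantizes with far less error than the raw weight.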
-
-#### Nunchaku Engine Design
-
-## Model Description
-
-- **Developed by:** MIT, NVIDIA, CMU, Princeton, UC Berkeley, SJTU, and Pika Labs
-- **Model type:** INT W4A4 model
-- **Model size:** 6.64GB
-- **Model resolution:** The total number of pixels must be a multiple of 65,536; for example, 1024×1024 works, since 1,048,576 = 16 × 65,536 (see the sketch below).
-- **License:** FLUX.1-dev Non-Commercial License
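The resolution rule above is easy to check programmatically. A minimal sketch, assuming only the multiple-of-65,536 rule stated in this card (`is_valid_resolution` is a hypothetical helper, not a nunchaku or diffusers API):

```python
def is_valid_resolution(height: int, width: int) -> bool:
    # Hypothetical helper: the card states the total pixel count
    # must be a multiple of 65,536 (= 256 * 256).
    return (height * width) % 65536 == 0

print(is_valid_resolution(1024, 1024))  # True: 1,048,576 = 16 * 65,536
print(is_valid_resolution(1000, 1000))  # False: 1,000,000 is not a multiple of 65,536
```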
-
-## Usage
-
-### Diffusers
-
-Please follow the instructions in [mit-han-lab/nunchaku](https://github.com/mit-han-lab/nunchaku) to set up the environment. Also, install the ControlNet dependencies:
-
-```shell
-pip install git+https://github.com/asomoza/image_gen_aux.git
-pip install controlnet_aux mediapipe
-```
-
-Then you can run the model with:
-
-```python
-import torch
-from controlnet_aux import CannyDetector
-from diffusers import FluxControlPipeline
-from diffusers.utils import load_image
-
-from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
-
-# Load the INT4-quantized transformer and plug it into the standard FLUX control pipeline.
-transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-canny-dev")
-pipe = FluxControlPipeline.from_pretrained(
-    "black-forest-labs/FLUX.1-Canny-dev", transformer=transformer, torch_dtype=torch.bfloat16
-).to("cuda")
-
-prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
-control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")
-
-# Extract a Canny edge map from the input image; the model follows these edges during generation.
-processor = CannyDetector()
-control_image = processor(
-    control_image, low_threshold=50, high_threshold=200, detect_resolution=1024, image_resolution=1024
-)
-
-# 1024x1024 satisfies the resolution constraint (a multiple of 65,536 pixels).
-image = pipe(
-    prompt=prompt, control_image=control_image, height=1024, width=1024, num_inference_steps=50, guidance_scale=30.0
-).images[0]
-image.save("flux.1-canny-dev.png")
-```
-
-### ComfyUI
-
-Work in progress. Stay tuned!
-
-## Limitations
-
-- The model runs only on NVIDIA GPUs with architectures sm_86 (Ampere: RTX 3090, A6000), sm_89 (Ada: RTX 4090), and sm_80 (A100). See this [issue](https://github.com/mit-han-lab/nunchaku/issues/1) for more details.
-- You may observe slight differences in detail compared with the BF16 model.
-
-### Citation
 
 ```bibtex
 @inproceedings{
@@ -122,4 +32,8 @@ If you find this model useful or relevant to your research, please cite
 booktitle={The Thirteenth International Conference on Learning Representations},
 year={2025}
 }
-```
 ---
+base_model: black-forest-labs/FLUX.1-Canny-dev
+base_model_relation: quantized
+datasets:
+- mit-han-lab/svdquant-datasets
+language:
+- en
+library_name: diffusers
 license: other
+license_link: https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev/blob/main/LICENSE.md
 license_name: flux-1-dev-non-commercial-license
+pipeline_tag: image-to-image
 tags:
 - image-to-image
 - SVDQuant
+- FLUX.1-Canny-dev
 - FLUX.1
 - Diffusion
 - Quantization
 - ICLR2025
 
+---
+**This repository has been deprecated and will be hidden in December 2025. Please use https://huggingface.co/nunchaku-tech/nunchaku-flux.1-canny-dev.**
 
+## Citation
 
 ```bibtex
 @inproceedings{
 booktitle={The Thirteenth International Conference on Learning Representations},
 year={2025}
 }
+```
+
+## Attribution Notice
+
+The FLUX.1 [dev] Model is licensed by Black Forest Labs Inc. under the FLUX.1 [dev] Non-Commercial License. Copyright Black Forest Labs Inc. IN NO EVENT SHALL BLACK FOREST LABS INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.