trithemius committed on
Commit
c9ce2c3
·
1 Parent(s): 7a613d0

Update README and add new models

README.md CHANGED
@@ -1,3 +1,85 @@
 ---
 license: other
 license_name: flux-dev

+ # FLUX.1-dev quantized versions
+
+ ## Quantized FLUX Transformer with Hyper-SD LoRA
+
+ This repository contains `flux-fp8-hyper8-transformers-lora.pt`, a FLUX transformer quantized to FP8 and merged with Hyper-SD LoRA weights, optimized for efficient few-step inference.
+
+ ### Model Details
+
+ - **Base Model**: FLUX.1-dev transformer from Black Forest Labs
+ - **LoRA**: Hyper-SD from ByteDance
+ - **Quantization**: FP8 (e5m2 format)
+ - **LoRA Scale**: 0.125
+
+ ### Technical Specifications
+
+ #### Quantization
+
+ - Uses 8-bit floating-point (FP8) quantization in the e5m2 format
+ - Implemented with the `optimum.quanto` library
+ - Weights are frozen after quantization for inference
+
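+ A minimal sketch of this step, using the `optimum.quanto` entry points named above (`quantize`, `freeze`, `qfloat8_e5m2`); the exact export script is not part of this repository, so the variable names are illustrative:
+
+ ```python
+ import torch
+ from diffusers import FluxTransformer2DModel
+ from optimum.quanto import freeze, qfloat8_e5m2, quantize
+
+ # Load the base transformer in bfloat16
+ transformer = FluxTransformer2DModel.from_pretrained(
+     "black-forest-labs/FLUX.1-dev", subfolder="transformer", torch_dtype=torch.bfloat16
+ )
+
+ # Swap the float weights for FP8 (e5m2) quantized versions
+ quantize(transformer, weights=qfloat8_e5m2)
+
+ # freeze() replaces the float parameters with their quantized counterparts
+ freeze(transformer)
+ ```
+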
+ #### Architecture
+
+ - Based on `FluxTransformer2DModel`
+ - Includes merged LoRA weights from Hyper-SD
+ - Optimized for 8-step inference
+
+ ### Model Creation Process
+
+ The checkpoint was produced in four steps, sketched in code after this list:
+
+ 1. **Base Model Loading**
+    - Loads the FLUX.1-dev transformer in bfloat16 format
+    - Source: `black-forest-labs/FLUX.1-dev`
+
+ 2. **Quantization**
+    - Applies FP8 quantization using `qfloat8_e5m2`
+    - Roughly halves the model size relative to bfloat16 while preserving output quality
+
+ 3. **LoRA Integration**
+    - Loads the Hyper-SD LoRA weights
+    - Merges them into the base model with a 0.125 scale factor
+    - Source: `ByteDance/Hyper-SD`
+
+ 4. **Model Freezing**
+    - Freezes the weights for efficient inference
+    - Saves the result as a PyTorch model file
+
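+ An end-to-end sketch of the process above. The Hyper-SD checkpoint filename and the use of `FluxPipeline.load_lora_weights` / `fuse_lora` for the merge are assumptions, since the actual export script is not included in this repository:
+
+ ```python
+ import torch
+ from diffusers import FluxPipeline
+ from huggingface_hub import hf_hub_download
+ from optimum.quanto import freeze, qfloat8_e5m2, quantize
+
+ # 1. Base model loading: the whole pipeline in bfloat16
+ pipe = FluxPipeline.from_pretrained(
+     "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
+ )
+
+ # 2. Quantization: FP8 (e5m2) weights for the transformer
+ quantize(pipe.transformer, weights=qfloat8_e5m2)
+
+ # 3. LoRA integration: merge the 8-step Hyper-SD LoRA at scale 0.125
+ #    (filename assumed from the ByteDance/Hyper-SD repository)
+ lora_path = hf_hub_download("ByteDance/Hyper-SD", "Hyper-FLUX.1-dev-8steps-lora.safetensors")
+ pipe.load_lora_weights(lora_path)
+ pipe.fuse_lora(lora_scale=0.125)
+
+ # 4. Model freezing: fix the quantized weights and save the transformer module
+ freeze(pipe.transformer)
+ torch.save(pipe.transformer, "flux-fp8-hyper8-transformers-lora.pt")
+ ```
+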
+ ### Usage
+
+ ```python
+ import torch
+
+ # optimum.quanto must be installed: the checkpoint pickles the quantized
+ # module classes. weights_only=False is required because the file stores a
+ # full module object, not a state dict (PyTorch >= 2.6 defaults to True).
+ transformer = torch.load('flux-fp8-hyper8-transformers-lora.pt', weights_only=False)
+ transformer.eval()  # ready for inference
+ ```
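+
+ To generate images, the loaded transformer can be dropped into a `FluxPipeline`. The prompt and `guidance_scale` below are illustrative; `num_inference_steps=8` matches the 8-step inference the merged Hyper-SD LoRA targets:
+
+ ```python
+ from diffusers import FluxPipeline
+
+ # Build the pipeline around the quantized transformer loaded above
+ pipe = FluxPipeline.from_pretrained(
+     "black-forest-labs/FLUX.1-dev",
+     transformer=transformer,
+     torch_dtype=torch.bfloat16,
+ ).to("cuda")
+
+ image = pipe(
+     "a watercolor fox in a snowy forest",  # illustrative prompt
+     num_inference_steps=8,  # the merged LoRA targets 8-step sampling
+     guidance_scale=3.5,
+ ).images[0]
+ image.save("fox.png")
+ ```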
+
+ ### Requirements
+
+ - PyTorch
+ - optimum.quanto
+ - diffusers
+ - huggingface_hub
+ - safetensors
+
+ ### References
+
+ - FLUX.1-dev: [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)
+ - Hyper-SD: [ByteDance/Hyper-SD](https://huggingface.co/ByteDance/Hyper-SD)
+
+ ### License
+
+ Please refer to the original FLUX.1-dev and Hyper-SD licenses for usage terms and conditions.
+
+ ### Acknowledgments
+
+ - [Black Forest Labs](https://huggingface.co/black-forest-labs) for the base FluxTransformer2DModel.
+ - [ByteDance](https://huggingface.co/ByteDance) for the LoRA weights.
+ - The developers of the `optimum.quanto` and `safetensors` libraries for their tools.
+
 ---
 license: other
 license_name: flux-dev
flux-fp8-hyper8-transformers-lora.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e5fecda7972b175eec5f328a40bfdefa359e851b2dee8f7bfe4454a19848590d
+ size 11911837562
flux-fp8-transformers.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c72da99b854cd878170bf231c7cabe1288fb6eef8b3c1d1aa96cc581a31b39f4
+ size 11911805322