Update README.md
README.md
CHANGED
@@ -1,84 +1,113 @@
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
tags:
|
4 |
-
- T5xxl
|
5 |
-
- Google FLAN
|
6 |
---
|
7 |
|
8 |
# FLAN-T5-XXL Fused Model
|
9 |
|
10 |
-
This repository
|
11 |
|
12 |
-
|
13 |
-
|
14 |
-
|
15 |
-
Two additional files provide the **Text Encoder (TE) only** portion of FLAN-T5-XXL, specifically extracted for use with Stable Diffusion WebUI Forge and ComfyUI.
|
16 |
-
|
17 |
-
<div style="display: flex; justify-content: center; align-items: center;">
|
18 |
-
<div style="text-align: center; margin-right: 1em;">
|
19 |
-
<img src="./flan_t5_xxl_TE-only_FP32_sample1.png" alt="flan_t5_xxl_TE-only_FP32_sample1" width="400px" height="400px">
|
20 |
</div>
|
21 |
-
<div
|
22 |
-
<img src="./flan_t5_xxl_TE-only_FP32_sample2.png" alt="
|
23 |
</div>
|
24 |
</div>
|
25 |
|
26 |
-
|
27 |
-
- `flan_t5_xxl_TE-only_FP16.safetensors`: Half-precision FP16 TE-only model for memory-efficient inference.
|
28 |
|
29 |
-
|
30 |
|
31 |
-
|
32 |
|
|
|
33 |
|
34 |
-
|
35 |
|
36 |
-
|
37 |
-
- `flan_t5_xxl_fp32.safetensors`: Full-precision FP32 Full model.
|
38 |
-
- `flan_t5_xxl_fp16.safetensors`: Half-precision FP16 Full model for memory-efficient inference.
|
39 |
|
40 |
-
|
41 |
|
42 |
-
|
43 |
|
44 |
-
|
45 |
-
|
46 |
</div>
|
47 |
|
48 |
-
|
49 |
|
50 |
## Comparison: FLAN-T5-XXL vs T5-XXL v1.1
|
51 |
|
52 |
-
<div style="display: flex; justify-content: center; align-items: center;">
|
53 |
-
<div
|
54 |
-
<img src="./flan_t5_xxl_image.png" alt="FLAN-T5-XXL Image" width="400px" height="400px">
|
55 |
-
<p>FLAN-T5-XXL Output</p>
|
56 |
</div>
|
57 |
-
<div
|
58 |
-
<img src="./t5_xxl_v1_1_image.png" alt="T5-XXL v1.1 Image" width="400px" height="400px">
|
59 |
-
<p>T5-XXL v1.1 Output</p>
|
60 |
</div>
|
61 |
</div>
|
62 |
|
63 |
-
These example images generated using **FLAN-T5-XXL** and [**T5-XXL v1.1**](https://huggingface.co/google/t5-v1_1-xxl) models in Flux.1.
|
64 |
|
65 |
-
|
66 |
|
67 |
-
|
68 |
|
69 |
-
|
70 |
|
71 |
-
|
72 |
-
- [FLAN-T5-XXL FP32 vs FP16 and other quantization](https://www.ai-image-journey.com/2024/12/image-difference-t5xxl-clip-l.html)
|
73 |
|
74 |
---
|
75 |
|
76 |
## License
|
77 |
-
This model is provided under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
|
78 |
-
The uploader does not claim any rights over the model.
|
79 |
-
|
80 |
-
---
|
81 |
|
82 |
-
|
83 |
-
-
|
84 |
-
- GGUF version: [dumb-dev's Hugging Face repository](https://huggingface.co/dumb-dev/flan-t5-xxl-gguf).
|
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
tags:
|
4 |
+
- T5xxl
|
5 |
+
- Google FLAN
|
6 |
---
|
7 |
|
8 |
# FLAN-T5-XXL Fused Model
|
9 |
|
10 |
+
This repository hosts a fused version of the FLAN-T5-XXL model, created by merging the split files from [Google's FLAN-T5-XXL repository](https://huggingface.co/google/flan-t5-xxl) into single files. The single-file format is easier to integrate into AI applications, including image generation workflows.
|
11 |
|
12 |
+
<div style="display: flex; justify-content: center; align-items: center; gap: 2em;">
|
13 |
+
<div>
|
14 |
+
<img src="./images/flan_t5_xxl_TE-only_FP32_sample1.png" alt="FLAN-T5-XXL sample image 1" width="400px" height="400px">
|
15 |
</div>
|
16 |
+
<div>
|
17 |
+
<img src="./images/flan_t5_xxl_TE-only_FP32_sample2.png" alt="FLAN-T5-XXL sample image 2" width="400px" height="400px">
|
18 |
</div>
|
19 |
</div>
|
20 |
|
21 |
+
Sample images were generated with the base model [**blue_pencil-flux1_v0.0.1**](https://huggingface.co/bluepen5805/blue_pencil-flux1).
|
|
|
22 |
|
23 |
+
## Key Features
|
24 |
|
25 |
+
- **Fused for Simplicity:** Combines split model files into a single, ready-to-use format.
|
26 |
+
- **Optimized Variants:** Available in FP32, FP16, and quantized GGUF formats to balance accuracy and resource usage.
|
27 |
+
- **Enhanced Prompt Accuracy:** As a text encoder for image generation, follows prompts more precisely than the standard T5-XXL v1.1 (see the comparison below).
|
28 |
|
29 |
+
## Model Variants
|
30 |
|
31 |
+
| Model File                           | Size     | Accuracy (SSIM)              | Recommended   |
|--------------------------------------|:--------:|:----------------------------:|:-------------:|
| flan_t5_xxl_fp32.safetensors         | 44.1 GB  | 100%                         |               |
| flan_t5_xxl_fp16.safetensors         | 22.1 GB  | 99.9%                        |               |
| flan_t5_xxl_TE-only_FP32.safetensors | 18.7 GB  | 100%                         | 🔺            |
| flan_t5_xxl_TE-only_FP16.safetensors | 9.4 GB   | 99.9%                        | ✅            |
| flan_t5_xxl_TE-only_Q8_0.gguf        | 5.5 GB   | 99.8%                        | ✅            |
| flan_t5_xxl_TE-only_Q6_K.gguf        | 4.4 GB   | 99.7%                        | 🔺            |
| flan_t5_xxl_TE-only_Q5_K_M.gguf      | 3.8 GB   | 98.4%                        | 🔺            |
| flan_t5_xxl_TE-only_Q4_K_M.gguf      | 3.2 GB   | 95.2%                        |               |
| flan_t5_xxl_TE-only_Q3_K_L.gguf      | 2.6 GB   | 84.9%                        |               |
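
As one way to read the table, the sketch below picks the highest-scoring TE-only file that fits a given size budget. The sizes and SSIM scores are copied from the table; the function name `pick_variant` and the use of file size as a stand-in for memory use are assumptions made for illustration, and actual memory use may differ from the file size.

```python
# (file name, size in GB, SSIM score in percent), copied from the table above.
TE_ONLY_VARIANTS = [
    ("flan_t5_xxl_TE-only_FP32.safetensors", 18.7, 100.0),
    ("flan_t5_xxl_TE-only_FP16.safetensors", 9.4, 99.9),
    ("flan_t5_xxl_TE-only_Q8_0.gguf", 5.5, 99.8),
    ("flan_t5_xxl_TE-only_Q6_K.gguf", 4.4, 99.7),
    ("flan_t5_xxl_TE-only_Q5_K_M.gguf", 3.8, 98.4),
    ("flan_t5_xxl_TE-only_Q4_K_M.gguf", 3.2, 95.2),
    ("flan_t5_xxl_TE-only_Q3_K_L.gguf", 2.6, 84.9),
]


def pick_variant(budget_gb, min_ssim=0.0):
    """Return the highest-SSIM TE-only file no larger than budget_gb, or None.

    Hypothetical helper: file size is only a rough proxy for memory use.
    """
    candidates = [
        (name, size, ssim)
        for name, size, ssim in TE_ONLY_VARIANTS
        if size <= budget_gb and ssim >= min_ssim
    ]
    if not candidates:
        return None
    # Prefer the higher score; break ties in favor of the smaller file.
    best = max(candidates, key=lambda v: (v[2], -v[1]))
    return best[0]


if __name__ == "__main__":
    print(pick_variant(6.0))
    print(pick_variant(12.0))
```

With a 6 GB budget this selects the Q8_0 file, and with 12 GB it selects the FP16 file, which matches the two rows marked ✅.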
|
42 |
|
43 |
+
### Comparison Graph
|
44 |
|
45 |
+
<div style="text-align: center; margin-left: auto; margin-right: auto; width: 600px; max-width: 80%;">
|
46 |
+
<img src="./images/Flan-T5xxl MAE and SSIM Similarity.png" alt="Flan-T5xxl MAE and SSIM Similarity Graph">
|
47 |
+
</div>
|
48 |
|
49 |
+
For a detailed comparison, refer to [this blog post](https://www.ai-image-journey.com/2024/12/image-difference-t5xxl-clip-l.html).
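
To give a feel for the two metrics named in the graph, here is a minimal sketch of MAE and a simplified SSIM over two equal-length lists of pixel values. This is not the blog post's measurement procedure, which is not reproduced here. Standard SSIM averages the score over local windows, while this version treats the whole image as a single window; the function names and the constants `0.01` and `0.03` (the commonly used defaults) are assumptions for illustration.

```python
def mae(x, y):
    """Mean absolute error between two equal-length sequences of pixel values."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM: one score for the whole image, no local windows."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    # Stabilizing constants, scaled by the dynamic range of the pixel values.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    reference = [0, 50, 100, 150]
    quantized = [10, 40, 120, 140]
    print(mae(reference, quantized))
    print(global_ssim(reference, quantized))
```

Identical inputs score 1.0 (reported as 100% in the table), and the score drops as the second image drifts from the reference.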
|
50 |
|
51 |
+
## Usage Instructions
|
52 |
+
|
53 |
+
Place the downloaded model files in one of the following directories:
|
54 |
+
- `installation_folder/models/text_encoder`
|
55 |
+
- `installation_folder/models/clip`
|
56 |
+
- `installation_folder/Models/CLIP`
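
The placement step can be sketched as a small script that looks for the first of these directories that exists and moves the downloaded file there. The function names are hypothetical, and the order in which the directories are tried is an assumption; check which directory your UI actually reads.

```python
import shutil
from pathlib import Path

# Directories listed above, relative to the installation folder.
CANDIDATE_DIRS = ("models/text_encoder", "models/clip", "Models/CLIP")


def find_encoder_dir(install_root):
    """Return the first listed directory that exists under install_root, or None."""
    root = Path(install_root)
    for rel in CANDIDATE_DIRS:
        candidate = root / rel
        if candidate.is_dir():
            return candidate
    return None


def install_model(model_file, install_root):
    """Move model_file into the text encoder directory and return its new path."""
    target_dir = find_encoder_dir(install_root)
    if target_dir is None:
        raise FileNotFoundError(
            f"none of {CANDIDATE_DIRS} exist under {install_root}"
        )
    destination = target_dir / Path(model_file).name
    shutil.move(str(model_file), str(destination))
    return destination
```

For example, `install_model("flan_t5_xxl_TE-only_FP16.safetensors", "installation_folder")` moves the file into whichever of the three directories is present.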
|
57 |
+
|
58 |
+
### Stable Diffusion WebUI Forge
|
59 |
+
|
60 |
+
In Stable Diffusion WebUI Forge, select the FLAN-T5-XXL model instead of the default T5xxl_v1_1 text encoder.
|
61 |
+
|
62 |
+
<div style="text-align: center; margin-left: auto; margin-right: auto; width: 800px; max-width: 80%;">
|
63 |
+
<img src="./images/Screenshot of Stable Diffusion WebUI Forge text encoder selection screen.png" alt="Stable Diffusion WebUI Forge Text Encoder Selection Screen">
|
64 |
</div>
|
65 |
|
66 |
+
**Note:** Stable Diffusion WebUI Forge does not support FP32 models. Use FP16 or GGUF formats instead.
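
If you are unsure whether a `.safetensors` file is FP32 or FP16, you can read the dtypes from its header before loading it in Forge. A minimal sketch, assuming the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function name is hypothetical, and GGUF files use a different format that this does not read.

```python
import json
import struct


def safetensors_dtypes(path):
    """Return the set of tensor dtypes declared in a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: header size as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"
    }
```

A file whose dtypes include `F32` is a full-precision file; per the note above, use an FP16 (`F16`) or GGUF file with Forge instead.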
|
67 |
+
|
68 |
+
### ComfyUI
|
69 |
+
|
70 |
+
**Sample Workflow**
|
71 |
+
|
72 |
+
For ComfyUI, we recommend using the [ComfyUI-MultiGPU](https://github.com/neuratech-ai/ComfyUI-MultiGPU) custom node to load the model into system RAM instead of VRAM.
|
73 |
+
|
74 |
+
<div style="text-align: center; margin-left: auto; margin-right: auto; width: 800px; max-width: 80%;">
|
75 |
+
<img src="./images/Screenshots of ComfyUI's DualCLIPLoaderMultiGPU and DualCLIPLoaderGGUFMultiGPU custom nodes.png" alt="ComfyUI DualCLIPLoaderMultiGPU and DualCLIPLoaderGGUFMultiGPU Custom Nodes">
|
76 |
+
</div>
|
77 |
+
|
78 |
+
Use the **DualCLIPLoaderMultiGPU** or **DualCLIPLoaderGGUFMultiGPU** node and set the device to **cpu** to load the model into system RAM.
|
79 |
+
|
80 |
+
**FP32 Support:** To use FP32 text encoders in ComfyUI, launch with the `--fp32-text-enc` flag.
|
81 |
|
82 |
## Comparison: FLAN-T5-XXL vs T5-XXL v1.1
|
83 |
|
84 |
+
<div style="display: flex; justify-content: center; align-items: center; gap: 2em;">
|
85 |
+
<div>
|
86 |
+
<img src="./images/flan_t5_xxl_image.png" alt="FLAN-T5-XXL Image" width="400px" height="400px">
|
|
|
87 |
</div>
|
88 |
+
<div>
|
89 |
+
<img src="./images/t5_xxl_v1_1_image.png" alt="T5-XXL v1.1 Image" width="400px" height="400px">
|
|
|
90 |
</div>
|
91 |
</div>
|
92 |
|
93 |
+
These example images were generated in Flux.1 using the **FLAN-T5-XXL** and [**T5-XXL v1.1**](https://huggingface.co/google/t5-v1_1-xxl) text encoders. The FLAN-T5-XXL result follows the prompt more accurately.
|
94 |
|
95 |
+
## Further Comparisons
|
96 |
|
97 |
+
- [FLAN-T5-XXL vs T5-XXL v1.1](https://ai-image-journey.blogspot.com/2024/12/clip-t5xxl-text-encoder.html)
|
98 |
+
- [FLAN-T5-XXL FP32 vs FP16 and Quantization](https://ai-image-journey.blogspot.com/2024/12/image-difference-t5xxl-clip-l.html)
|
99 |
+
|
100 |
+
### Tip: Upgrade CLIP-L Too
|
101 |
|
102 |
+
For even better results, consider upgrading the CLIP-L text encoder alongside FLAN-T5-XXL:
|
103 |
+
- [LongCLIP-SAE-ViT-L-14](https://huggingface.co/zer0int/LongCLIP-SAE-ViT-L-14) (ComfyUI only)
|
104 |
+
- [CLIP-SAE-ViT-L-14](https://huggingface.co/zer0int/CLIP-SAE-ViT-L-14)
|
105 |
|
106 |
+
Combining FLAN-T5-XXL with an upgraded CLIP-L can further enhance image quality.
|
|
|
107 |
|
108 |
---
|
109 |
|
110 |
## License
|
111 |
|
112 |
+
- This model is distributed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
|
113 |
+
- The uploader claims no ownership or rights over the model.
|
|