|
--- |
|
license: apache-2.0 |
|
tags: |
|
- T5xxl |
|
- Google FLAN |
|
--- |
|
|
|
# FLAN-T5-XXL Fused Model |
|
|
|
**Guide (External Site):** [English](https://www.ai-image-journey.com/2025/03/flan-t5xxl-te-only.html) | [Japanese](https://note.com/ai_image_journey/n/ncc6b1c475d8f) |
|
|
|
This repository hosts a fused version of the FLAN-T5-XXL model, created by combining the split files from [Google's FLAN-T5-XXL repository](https://huggingface.co/google/flan-t5-xxl). The files have been merged for convenience, making it easier to integrate into AI applications, including image generation workflows. |
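
The fusion step itself is simply a matter of loading each shard and writing the tensors back out as one file. Below is a minimal sketch of that idea using the `safetensors` library; the shard names and output path are placeholders, not the actual files in this repository.

```python
# Minimal sketch: merge sharded safetensors checkpoints into a single file.
# Shard names and the output path are placeholders, not this repo's actual files.
from safetensors.torch import load_file, save_file

shards = [
    "model-00001-of-00002.safetensors",  # hypothetical shard names
    "model-00002-of-00002.safetensors",
]

merged = {}
for shard in shards:
    # Each shard holds a disjoint subset of the model's state dict.
    merged.update(load_file(shard))

# Write all tensors back out as a single fused file.
save_file(merged, "flan_t5_xxl_fused.safetensors")
```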
|
|
|
<div style="display: flex; justify-content: center; align-items: center; gap: 2em;"> |
|
<div> |
|
<img src="./images/flan_t5_xxl_TE-only_FP32_sample1.png" alt="FLAN-T5-XXL sample image 1" width="400px" height="400px"> |
|
</div> |
|
<div> |
|
<img src="./images/flan_t5_xxl_TE-only_FP32_sample2.png" alt="FLAN-T5-XXL sample image 2" width="400px" height="400px"> |
|
</div> |
|
</div> |
|
|
|
Base model used for the sample images above: [**blue_pencil-flux1_v0.0.1**](https://huggingface.co/bluepen5805/blue_pencil-flux1)
|
|
|
## Key Features |
|
|
|
- **Fused for Simplicity:** Combines split model files into a single, ready-to-use format. |
|
- **Optimized Variants:** Available in FP32, FP16, and quantized GGUF formats to balance accuracy and resource usage. |
|
- **Enhanced Prompt Accuracy:** Follows prompts more accurately than the standard T5-XXL v1.1 text encoder in image generation tasks.
|
|
|
## Model Variants |
|
|
|
| Model | Size | SSIM Similarity | Recommended |
| :---: | :---: | :---: | :---: |
| FP32 | 19 GB | 100.0 % | 🔺 |
| FP16 | 9.6 GB | 98.0 % | ✅ |
| FP8 | 4.8 GB | 95.3 % | 🔺 |
| Q8_0 | 6 GB | 97.6 % | ✅ |
| Q6_K | 4.9 GB | 97.3 % | 🔺 |
| Q5_K_M | 4.3 GB | 94.8 % | |
| Q4_K_M | 3.7 GB | 96.4 % | |
|
|
|
### Comparison Graph |
|
|
|
<div style="text-align: center; margin-left: auto; margin-right: auto; width: 600px; max-width: 80%;"> |
|
<img src="./images/Flan-T5xxl_TE-only_MAE_SSIM_Similarity.png" alt="Flan-T5xxl MAE and SSIM Similarity Graph"> |
|
</div> |
|
|
|
For a detailed comparison, refer to [this blog post](https://www.ai-image-journey.com/2024/12/image-difference-t5xxl-clip-l.html). |
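
The SSIM figures above appear to compare each variant's output against the FP32 result (FP32 itself scores 100 %). If you want to run a similar check on your own generations, a rough sketch with scikit-image is shown below; the file names are placeholders, and the exact measurement settings used for the table are not documented here.

```python
# Rough sketch: SSIM between an image from a quantized variant and the FP32
# reference. File names are placeholders; the exact settings behind the table
# above are not documented in this card.
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity

ref = np.asarray(Image.open("fp32_sample.png").convert("RGB"))
test = np.asarray(Image.open("q8_0_sample.png").convert("RGB"))

# channel_axis=2 treats the last axis as the RGB channels.
score = structural_similarity(ref, test, channel_axis=2)
print(f"SSIM similarity: {score * 100:.1f} %")
```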
|
|
|
## Usage Instructions |
|
|
|
Place the downloaded model files in one of the following directories: |
|
- `installation_folder/models/text_encoder` |
|
- `installation_folder/models/clip` |
|
- `installation_folder/Models/CLIP` |
|
|
|
### Stable Diffusion WebUI Forge |
|
|
|
In Stable Diffusion WebUI Forge, select the FLAN-T5-XXL model instead of the default T5xxl_v1_1 text encoder. |
|
|
|
<div style="text-align: center; margin-left: auto; margin-right: auto; width: 800px; max-width: 80%;"> |
|
<img src="./images/Screenshot of Stable Diffusion WebUI Forge text encoder selection screen.png" alt="Stable Diffusion WebUI Forge Text Encoder Selection Screen"> |
|
</div> |
|
|
|
**Note:** Stable Diffusion WebUI Forge does not support FP32 models. Use FP16 or GGUF formats instead. |
|
|
|
### ComfyUI |
|
|
|
**Sample Workflow** |
|
|
|
<div style="text-align: center; margin-left: auto; margin-right: auto; width: 1200px; max-width: 90%;"> |
|
<img src="./images/Flux1_MultiGPU 2025.3.20.png" alt="Flux1_MultiGPU sample workflow"> |
|
</div> |
|
|
|
<p style="text-align:center;font-size:0.8em;">This PNG image contains an embedded ComfyUI workflow; drag and drop it into ComfyUI to load it.</p>
|
|
|
For ComfyUI, we recommend using the [ComfyUI-MultiGPU](https://github.com/neuratech-ai/ComfyUI-MultiGPU) custom node to load the model into system RAM instead of VRAM. |
|
|
|
<div style="text-align: center; margin-left: auto; margin-right: auto; width: 800px; max-width: 80%;"> |
|
<img src="./images/Screenshots of ComfyUI's DualCLIPLoaderMultiGPU and DualCLIPLoaderGGUFMultiGPU custom nodes.png" alt="ComfyUI DualCLIPLoaderMultiGPU and DualCLIPLoaderGGUFMultiGPU Custom Nodes"> |
|
</div> |
|
|
|
Use the **DualCLIPLoaderMultiGPU** or **DualCLIPLoaderGGUFMultiGPU** node and set the device to **cpu** to load the model into system RAM. |
|
|
|
**FP32 Support:** To use FP32 text encoders in ComfyUI, launch with the `--fp32-text-enc` flag. |
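
Outside of Forge and ComfyUI, the same swap can be made programmatically by loading FLAN-T5-XXL as the T5 text encoder of a Flux pipeline. The sketch below uses the `diffusers` library with placeholder repository IDs and settings; it illustrates the idea and is not the exact setup used for the images in this card.

```python
# Sketch: use FLAN-T5-XXL as the T5 text encoder in a diffusers Flux pipeline.
# Repository IDs and settings are assumptions, not an exact recipe from this card.
import torch
from transformers import T5EncoderModel
from diffusers import FluxPipeline

# Load the FLAN-T5-XXL encoder in place of the default T5-XXL v1.1 encoder.
text_encoder_2 = T5EncoderModel.from_pretrained(
    "google/flan-t5-xxl", torch_dtype=torch.bfloat16
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # base Flux model (assumption)
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keeps VRAM usage manageable

image = pipe("a red fox standing in fresh snow", num_inference_steps=28).images[0]
image.save("flan_t5_xxl_fox.png")
```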
|
|
|
## Comparison: FLAN-T5-XXL vs T5-XXL v1.1 |
|
|
|
<div style="display: flex; justify-content: center; align-items: center; gap: 2em;"> |
|
<div> |
|
<img src="./images/flan_t5_xxl_image.png" alt="FLAN-T5-XXL Image" width="400px" height="400px"> |
|
</div> |
|
<div> |
|
<img src="./images/t5_xxl_v1_1_image.png" alt="T5-XXL v1.1 Image" width="400px" height="400px"> |
|
</div> |
|
</div> |
|
|
|
These example images were generated in Flux.1 using **FLAN-T5-XXL** (left) and [**T5-XXL v1.1**](https://huggingface.co/google/t5-v1_1-xxl) (right) as the text encoder. FLAN-T5-XXL follows the prompt more accurately.
|
|
|
## Further Comparisons |
|
|
|
- [FLAN-T5-XXL vs T5-XXL v1.1](https://www.ai-image-journey.com/2024/12/clip-t5xxl-text-encoder.html) |
|
- [FLAN-T5-XXL FP32 vs FP16 and Quantization](https://www.ai-image-journey.com/2024/12/image-difference-t5xxl-clip-l.html) |
|
|
|
--- |
|
|
|
## License |
|
|
|
- This model is distributed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). |
|
- The uploader claims no ownership or rights over the model. |