---
license: apache-2.0
tags:
  - T5xxl
  - Google FLAN
---

# FLAN-T5-XXL Fused Model

**Guide (External Site):** [English](https://www.ai-image-journey.com/2025/03/flan-t5xxl-te-only.html) | [Japanese](https://note.com/ai_image_journey/n/ncc6b1c475d8f)

This repository hosts a fused version of the FLAN-T5-XXL model, created by merging the split weight files from [Google's FLAN-T5-XXL repository](https://huggingface.co/google/flan-t5-xxl) into single files. The merged files are easier to integrate into AI applications, including image generation workflows.

<div style="display: flex; justify-content: center; align-items: center; gap: 2em;">
  <div>
    <img src="./images/flan_t5_xxl_TE-only_FP32_sample1.png" alt="FLAN-T5-XXL sample image 1" width="400px" height="400px">
  </div>
  <div>
    <img src="./images/flan_t5_xxl_TE-only_FP32_sample2.png" alt="FLAN-T5-XXL sample image 2" width="400px" height="400px">
  </div>
</div>

Base Model: [**blue_pencil-flux1_v0.0.1**](https://huggingface.co/bluepen5805/blue_pencil-flux1)

## Key Features

- **Fused for Simplicity:** Combines split model files into a single, ready-to-use format.
- **Optimized Variants:** Available in FP32, FP16, and quantized GGUF formats to balance accuracy and resource usage.
- **Enhanced Prompt Accuracy:** In the comparisons linked below, follows image generation prompts more accurately than the standard T5-XXL v1.1 text encoder.

## Model Variants

| Model File                          | Size   | Accuracy (SSIM Similarity) | Recommended |
|-------------------------------------|:--------:|:----------------------------:|:-------------:|
| flan_t5_xxl_full_fp32.safetensors       | 44.1GB | 100%                       |             |
| flan_t5_xxl_full_fp16.safetensors       | 22.1GB | 99.9%                      |             |
| flan_t5_xxl_TE-only_FP32.safetensors| 18.7GB | 100%                       | 🔺           |
| flan_t5_xxl_TE-only_FP16.safetensors| 9.4GB  | 99.9%                      | ✅          |
| flan_t5_xxl_TE-only_Q8_0.gguf      | 5.5GB  | 99.8%                      | ✅          |
| flan_t5_xxl_TE-only_Q6_K.gguf      | 4.4GB  | 99.7%                      | 🔺           |
| flan_t5_xxl_TE-only_Q5_K_M.gguf    | 3.8GB  | 98.4%                      | 🔺           |
| flan_t5_xxl_TE-only_Q4_K_M.gguf    | 3.2GB  | 95.2%                      |             |
| flan_t5_xxl_TE-only_Q3_K_L.gguf    | 2.6GB  | 84.9%                      |             |
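
The trade-off in the table can also be expressed in code. The following is a minimal sketch that picks the variant with the highest SSIM similarity whose file fits a given size budget. The figures are copied from the table; treating file size as a stand-in for memory requirements is an assumption, and `pick_variant` is a hypothetical helper, not part of any tool mentioned here.

```python
# (file name, size in GB, SSIM similarity in %) copied from the table above.
VARIANTS = [
    ("flan_t5_xxl_full_fp32.safetensors", 44.1, 100.0),
    ("flan_t5_xxl_full_fp16.safetensors", 22.1, 99.9),
    ("flan_t5_xxl_TE-only_FP32.safetensors", 18.7, 100.0),
    ("flan_t5_xxl_TE-only_FP16.safetensors", 9.4, 99.9),
    ("flan_t5_xxl_TE-only_Q8_0.gguf", 5.5, 99.8),
    ("flan_t5_xxl_TE-only_Q6_K.gguf", 4.4, 99.7),
    ("flan_t5_xxl_TE-only_Q5_K_M.gguf", 3.8, 98.4),
    ("flan_t5_xxl_TE-only_Q4_K_M.gguf", 3.2, 95.2),
    ("flan_t5_xxl_TE-only_Q3_K_L.gguf", 2.6, 84.9),
]


def pick_variant(budget_gb):
    """Return the file name with the highest SSIM that fits budget_gb, or None."""
    fitting = [v for v in VARIANTS if v[1] <= budget_gb]
    if not fitting:
        return None
    # Prefer higher SSIM; on a tie, prefer the smaller file.
    best = max(fitting, key=lambda v: (v[2], -v[1]))
    return best[0]


print(pick_variant(10.0))  # flan_t5_xxl_TE-only_FP16.safetensors
```

The tie-break on file size means a 100% TE-only file is chosen over the larger full model when both fit the budget.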

### Comparison Graph

<div style="text-align: center; margin-left: auto; margin-right: auto; width: 600px; max-width: 80%;">
  <img src="./images/Flan-T5xxl MAE and SSIM Similarity.png" alt="Flan-T5xxl MAE and SSIM Similarity Graph">
</div>

For a detailed comparison, refer to [this blog post](https://www.ai-image-journey.com/2024/12/image-difference-t5xxl-clip-l.html).

## Usage Instructions

Place the downloaded model file in whichever of the following directories your application uses:
- `installation_folder/models/text_encoder`
- `installation_folder/models/clip`
- `installation_folder/Models/CLIP`
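
The placement step above could be scripted roughly as follows. This is a sketch under assumptions: `find_text_encoder_dir` and `install_model` are hypothetical helper names, and the script simply uses the first of the listed directories that already exists under the installation folder you pass in.

```python
import shutil
from pathlib import Path

# Candidate sub-directories, in the order listed above.
CANDIDATE_DIRS = ["models/text_encoder", "models/clip", "Models/CLIP"]


def find_text_encoder_dir(install_root):
    """Return the first candidate directory that exists under install_root, or None."""
    root = Path(install_root)
    for rel in CANDIDATE_DIRS:
        candidate = root / rel
        if candidate.is_dir():
            return candidate
    return None


def install_model(model_file, install_root):
    """Copy model_file into the detected directory and return the new path."""
    target_dir = find_text_encoder_dir(install_root)
    if target_dir is None:
        raise FileNotFoundError(f"no text encoder directory under {install_root}")
    return Path(shutil.copy2(model_file, target_dir))
```

For example, `install_model("flan_t5_xxl_TE-only_FP16.safetensors", "installation_folder")` copies the file into the first matching directory. Copying rather than moving keeps the download intact at the cost of temporary extra disk space.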

### Stable Diffusion WebUI Forge

In Stable Diffusion WebUI Forge, select the FLAN-T5-XXL model instead of the default T5xxl_v1_1 text encoder.

<div style="text-align: center; margin-left: auto; margin-right: auto; width: 800px; max-width: 80%;">
  <img src="./images/Screenshot of Stable Diffusion WebUI Forge text encoder selection screen.png" alt="Stable Diffusion WebUI Forge Text Encoder Selection Screen">
</div>

**Note:** Stable Diffusion WebUI Forge does not support FP32 models. Use FP16 or GGUF formats instead.

### ComfyUI

**Sample Workflow**

<div style="text-align: center; margin-left: auto; margin-right: auto; width: 1200px; max-width: 90%;">
  <img src="./images/Flux1_MultiGPU 2025.3.10.png" alt="Flux1_MultiGPU sample workflow">
</div>

<p style="text-align:center;font-size:0.8em;">This PNG image contains a ComfyUI workflow.</p>

For ComfyUI, we recommend using the [ComfyUI-MultiGPU](https://github.com/neuratech-ai/ComfyUI-MultiGPU) custom node to load the model into system RAM instead of VRAM.

<div style="text-align: center; margin-left: auto; margin-right: auto; width: 800px; max-width: 80%;">
  <img src="./images/Screenshots of ComfyUI's DualCLIPLoaderMultiGPU and DualCLIPLoaderGGUFMultiGPU custom nodes.png" alt="ComfyUI DualCLIPLoaderMultiGPU and DualCLIPLoaderGGUFMultiGPU Custom Nodes">
</div>

In the **DualCLIPLoaderMultiGPU** (safetensors) or **DualCLIPLoaderGGUFMultiGPU** (GGUF) node, set the device to **cpu** so that the text encoder is loaded into system RAM.

**FP32 Support:** To use FP32 text encoders in ComfyUI, launch with the `--fp32-text-enc` flag.
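
As a rough illustration, the launch command could be assembled like this. The sketch assumes ComfyUI is started from its own folder through its `main.py` entry script and that the precision can be read from the file name; `comfyui_command` is a hypothetical helper.

```python
import sys


def comfyui_command(text_encoder_file, python=sys.executable):
    """Build a ComfyUI launch command, adding --fp32-text-enc for FP32 files."""
    # main.py is ComfyUI's entry script; run the command from the ComfyUI folder.
    command = [python, "main.py"]
    # Assumption: the file name states the precision, as in this repository.
    if "fp32" in text_encoder_file.lower():
        command.append("--fp32-text-enc")
    return command


print(comfyui_command("flan_t5_xxl_TE-only_FP32.safetensors", python="python"))
# ['python', 'main.py', '--fp32-text-enc']
```

The resulting list can be passed to `subprocess.run` or joined into a single shell command.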

## Comparison: FLAN-T5-XXL vs T5-XXL v1.1

<div style="display: flex; justify-content: center; align-items: center; gap: 2em;">
  <div>
    <img src="./images/flan_t5_xxl_image.png" alt="FLAN-T5-XXL Image" width="400px" height="400px">
  </div>
  <div>
    <img src="./images/t5_xxl_v1_1_image.png" alt="T5-XXL v1.1 Image" width="400px" height="400px">
  </div>
</div>

These example images were generated in Flux.1 with **FLAN-T5-XXL** (left) and [**T5-XXL v1.1**](https://huggingface.co/google/t5-v1_1-xxl) (right) as the text encoder. In this comparison, FLAN-T5-XXL follows the prompt more accurately.

## Further Comparisons

- [FLAN-T5-XXL vs T5-XXL v1.1](https://www.ai-image-journey.com/2024/12/clip-t5xxl-text-encoder.html)
- [FLAN-T5-XXL FP32 vs FP16 and Quantization](https://www.ai-image-journey.com/2024/12/image-difference-t5xxl-clip-l.html)

---

## License

- This model is distributed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
- The uploader claims no ownership or rights over the model.