Bokeh 3.5 Medium

Bokeh 3.5 Medium uses Stable Diffusion 3.5 Medium as its foundation model. It was post-trained on a high-resolution open-source dataset of 5M images that underwent rigorous quality and aesthetic screening, with the aim of excellent image quality, high fidelity for natural images, preservation of fine details, and enhanced controllability.
This model is released under the Stability Community License. For more details, visit Tensor.Art or TusiArt to explore additional resources and useful information.
Overview
- Continued training on SD3.5M, utilizing carefully curated high-resolution training data to achieve excellent image quality.
- Trained with mixed short/long natural language captions.
  - Short Captions: Focus on the core subject content of the image.
  - Long Captions: Provide broader descriptions of the scene environment and atmosphere.
- Recommended Resolutions: 1920x1024, 1728x1152, 1152x1728, 1280x1664, 1440x1440
- Powerful customized fine-tuning performance that can be widely used for downstream production tasks.
- 8 to 10-step image generation through strong distillation: high-resolution images can be generated in about 5 seconds on a 3090-class GPU, with some quality loss. Use either the 8-step LoRA with the base checkpoint or the standalone 8-step checkpoint.
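The recommended resolutions listed in the overview can be kept in a small lookup so that an arbitrary target size is snapped to the nearest supported aspect ratio. The following is a minimal sketch, not part of the model's tooling: the helper name `closest_resolution` is hypothetical, and reading each pair as width x height is an assumption, since the card does not state the order.

```python
import math

# Sizes copied from the "Recommended Resolutions" bullet.
# Assumption: each pair is (width, height); the card does not say which is which.
RECOMMENDED = [
    (1920, 1024),
    (1728, 1152),
    (1152, 1728),
    (1280, 1664),
    (1440, 1440),
]


def closest_resolution(width: int, height: int) -> tuple:
    """Return the recommended size whose aspect ratio is nearest to width/height.

    Ratios are compared on a log scale so that "twice as wide" and
    "twice as tall" count as equally far from square.
    """
    target = width / height
    return min(
        RECOMMENDED,
        key=lambda wh: abs(math.log((wh[0] / wh[1]) / target)),
    )


if __name__ == "__main__":
    print(closest_resolution(1600, 900))   # a 16:9 request
    print(closest_resolution(1000, 1000))  # a square request
```

The returned pair can then be passed as `width` and `height` to the pipeline call, after confirming the intended orientation.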
Advantages
🖼️ High-Quality Image Generation
- State-of-the-art visual fidelity with improved detail extraction and aesthetic consistency.
- Enhanced resolution support up to 2 million pixels (about 2 megapixels), ensuring highly detailed image outputs.
- Carefully curated dataset ensures better composition, lighting, and overall artistic appeal.
🎯 Powerful Custom Fine-Tuning
- Exceptional LoRA training support, making it highly effective for:
  - Photography
  - 3D Rendering
  - Illustration
  - Concept Art
⚡ Efficient Inference & Training
- Low hardware requirements for inference:
  - Medium model: 9GB VRAM (without T5)
  - Full weights inference: 16GB VRAM (suitable for local deployment)
- LoRA fine-tuning VRAM requirement: 12GB - 32GB
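The "without T5" figure above presumably refers to running the pipeline without the third (T5) text encoder. In diffusers this can be done by passing `text_encoder_3=None` and `tokenizer_3=None` to `StableDiffusion3Pipeline.from_pretrained`. The sketch below assumes that is the configuration the card means; the helper names `build_load_kwargs` and `load_pipeline` are hypothetical, and actual VRAM use was not measured here.

```python
def build_load_kwargs(use_t5: bool = True) -> dict:
    """Extra keyword arguments for from_pretrained.

    With use_t5=False the T5 text encoder and its tokenizer are not loaded,
    which lowers memory use at some cost in prompt understanding.
    """
    kwargs = {}
    if not use_t5:
        kwargs["text_encoder_3"] = None
        kwargs["tokenizer_3"] = None
    return kwargs


def load_pipeline(use_t5: bool = True):
    """Load Bokeh 3.5 Medium, optionally without T5 (requires a CUDA GPU)."""
    # Imported lazily so the helper above can be used without torch installed.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        "tensorart/bokeh_3.5_medium",
        torch_dtype=torch.bfloat16,
        **build_load_kwargs(use_t5),
    )
    return pipe.to("cuda")
```

Calling `load_pipeline(use_t5=False)` would give the lower-memory variant. When T5 is dropped, long captions may be followed less closely, so shorter prompts are likely the safer choice.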
Known Issues
- Potential human anatomy inconsistencies.
- Limited ability to generate photorealistic images.
- Some concepts may suffer from aesthetic quality issues.
Prompting Guide
Use a structured prompt combining:
- Main subject (e.g., "Close-up of a macaw")
- Detailed features (e.g., "vivid feathers, sharp beak")
- Background environment (e.g., "dimly lit environment")
- Atmospheric description (e.g., "soft warm lighting, cinematic mood")
- Optimal token length: 30-70 tokens.
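The four-part structure above can be assembled programmatically. The following is a minimal sketch with hypothetical helper names (`build_prompt`, `rough_token_count`); the token count is a crude word-and-punctuation proxy, not the model's real tokenizers, so treat it only as a rough check against the 30-70 token guideline.

```python
import re


def build_prompt(subject: str, features: str = "",
                 background: str = "", atmosphere: str = "") -> str:
    """Join the non-empty prompt parts in the recommended order."""
    parts = (subject, features, background, atmosphere)
    return ", ".join(p.strip() for p in parts if p and p.strip())


def rough_token_count(prompt: str) -> int:
    """Approximate token count: each word or punctuation mark counts as one.

    Assumption: this only approximates what the CLIP/T5 tokenizers produce.
    """
    return len(re.findall(r"\w+|[^\w\s]", prompt))


def in_recommended_range(prompt: str, low: int = 30, high: int = 70) -> bool:
    """True when the rough count falls inside the 30-70 token guideline."""
    return low <= rough_token_count(prompt) <= high


if __name__ == "__main__":
    prompt = build_prompt(
        "Close-up of a macaw",
        "vivid feathers, sharp beak",
        "dimly lit environment",
        "soft warm lighting, cinematic mood",
    )
    print(prompt)
    print(rough_token_count(prompt), in_recommended_range(prompt))
```

By this rough measure the four example fragments alone fall short of 30 tokens, so a prompt in the recommended range would need somewhat more detail in each part.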
Example Output
Using diffusers:
import torch
from diffusers import StableDiffusion3Pipeline

# Load the checkpoint in bfloat16 and move it to the GPU.
pipe = StableDiffusion3Pipeline.from_pretrained("tensorart/bokeh_3.5_medium", torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")

image = pipe(
    "Close-up of a macaw, dimly lit environment",
    num_inference_steps=28,
    guidance_scale=4,
    height=1920,
    width=1024,
    negative_prompt="anime,cartoon,bad hands,extra finger,blurred,text,watermark",
    negative_prompt_3="",  # empty negative prompt for the third (T5) text encoder
).images[0]
image.save("macaw.jpg")
Using ComfyUI: download the workflow JSON file and load it in ComfyUI.
🔧 Training Tools
- Kohya_ss: GitHub Repository
- Simple Tuner: GitHub Repository
Contact
- Website: https://tensor.art https://tusiart.com
- Developed by: TensorArt