---
base_model: stable-diffusion-v1-5/stable-diffusion-v1-5
library_name: diffusers
license: creativeml-openrail-m
inference: true
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- diffusers-training
- lora
datasets:
- lambdalabs/naruto-blip-captions
---
# LoRA text2image fine-tuning - Bhaskar009/SD_1.5_LoRA
These are LoRA adaptation weights for stable-diffusion-v1-5/stable-diffusion-v1-5, fine-tuned on the lambdalabs/naruto-blip-captions dataset. Some example images are shown below.
![img_0](./image_0.png)
![img_1](./image_1.png)
![img_2](./image_2.png)
![img_3](./image_3.png)
## Intended uses & limitations
#### How to use
```python
import matplotlib.pyplot as plt
from diffusers import DiffusionPipeline

# Load the base model and move it to the GPU (CUDA)
pipe = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5").to("cuda")

# Load the fine-tuned LoRA weights on top of the base model
pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")

# Define a Naruto-themed prompt
prompt = "A detailed anime-style portrait of Naruto Uzumaki, wearing his Hokage cloak, standing under a bright sunset, ultra-detailed, cinematic lighting, 8K"

# Generate the image (returned as a PIL image)
image = pipe(prompt).images[0]

# Display the image using matplotlib
plt.figure(figsize=(6, 6))
plt.imshow(image)
plt.axis("off")  # Hide axes for a clean view
plt.show()
```
#### Limitations and bias
[TODO: provide examples of latent issues and potential remediations]
## Training details - Stable Diffusion LoRA
#### Dataset
- The model was trained on the `lambdalabs/naruto-blip-captions` dataset.
- This dataset consists of Naruto character images with BLIP-generated captions.
- It provides a diverse set of characters, poses, and backgrounds, making it suitable for fine-tuning Stable Diffusion on anime-style images.
#### Model
- Base Model: Stable Diffusion v1.5 (stable-diffusion-v1-5/stable-diffusion-v1-5)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Purpose: Specializing Stable Diffusion to generate Naruto-style anime characters.
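
To illustrate why LoRA keeps the fine-tuned weights small, the sketch below counts trainable parameters for a single linear layer. Instead of updating a full `d_out x d_in` weight matrix, LoRA trains two low-rank factors of shapes `d_out x r` and `r x d_in`. The layer size (320) and the rank (4) are illustrative assumptions; this card does not state the rank used in training.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in matrix."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Number of weights in the two LoRA factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


# Illustrative values only (assumptions, not taken from this model card)
d_out = d_in = 320
rank = 4

full = full_param_count(d_out, d_in)
lora = lora_param_count(d_out, d_in, rank)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.3%}")
```

Under these assumed values, the LoRA factors hold a small fraction of the parameters of the dense layer they adapt, which is why the uploaded weights are much smaller than the base model.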
#### Preprocessing
- Images were resized to 512x512 resolution.
- Center cropping was applied to produce square images without distorting the aspect ratio.
- Random flipping was used as a data augmentation technique.
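
The resize-and-crop geometry can be sketched in plain Python. The snippet assumes the common recipe of scaling the shorter side to 512 pixels and then taking a centered 512x512 crop. The function name `resize_and_center_crop_box` is hypothetical, the 0.5 flip probability is an assumption, and the rounding may differ slightly from what an image library does.

```python
import random


def resize_and_center_crop_box(width: int, height: int, size: int = 512):
    """Return the resized (w, h) and the centered crop box (left, top, right, bottom)."""
    # Scale so that the shorter side equals `size`, keeping the aspect ratio
    scale = size / min(width, height)
    new_w = max(size, round(width * scale))
    new_h = max(size, round(height * scale))
    # Take a size x size window from the middle of the resized image
    left = (new_w - size) // 2
    top = (new_h - size) // 2
    return (new_w, new_h), (left, top, left + size, top + size)


resized, box = resize_and_center_crop_box(1024, 2048)
print(resized, box)

# Random flip as augmentation: mirror the crop with an assumed probability of 0.5
rng = random.Random(0)
do_flip = rng.random() < 0.5
print("flip this sample:", do_flip)
```

Because the crop is taken after an aspect-preserving resize, content near the edges of the longer side is discarded rather than squashed.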
#### Training Configuration
- Batch Size: 1
- Gradient Accumulation Steps: 4 (simulates a larger batch size)
- Gradient Checkpointing: Enabled (reduces memory consumption)
- Max Training Steps: 800
- Learning Rate: 1e-5 (constant schedule, no warmup)
- Max Gradient Norm: 1 (gradient clipping, prevents exploding gradients)
- Memory Optimization: xFormers enabled for efficient attention computation
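
As a worked example of these numbers, the effective batch size and the total number of training samples processed follow directly from the configuration. The sketch assumes a single GPU and that each of the 800 training steps is one optimizer update (one update per 4 accumulated batches); neither is stated explicitly in this card.

```python
batch_size = 1
gradient_accumulation_steps = 4
max_train_steps = 800
num_devices = 1  # assumption: single-GPU training

# Gradients from several small batches are summed before each optimizer update
effective_batch_size = batch_size * gradient_accumulation_steps * num_devices

# Assumption: max_train_steps counts optimizer updates, not individual batches
samples_seen = effective_batch_size * max_train_steps

print(f"effective batch size: {effective_batch_size}")
print(f"training samples processed: {samples_seen}")
```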
#### Validation
- A validation prompt "A Naruto character" was used.
- 4 validation images were generated during training.
- Model checkpoints were saved every 500 steps.
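
With 800 training steps and a checkpoint interval of 500, only one intermediate checkpoint is written. The sketch below lists the checkpoint steps, assuming a checkpoint is saved whenever the step count is a multiple of the interval.

```python
max_train_steps = 800
checkpointing_steps = 500

# Assumption: a checkpoint is written at every step divisible by the interval
checkpoint_steps = [
    step
    for step in range(1, max_train_steps + 1)
    if step % checkpointing_steps == 0
]
print(checkpoint_steps)  # [500]
```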
#### Model Output
- The fine-tuned LoRA model was saved to "sd-naruto-model".
- The model was pushed to the Hugging Face Hub repository Bhaskar009/SD_1.5_LoRA.
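
The settings listed in this card can be collected into a single launch command. The sketch below assumes training was run with the `train_text_to_image_lora.py` example script from the diffusers repository, launched through `accelerate`. The card does not state the exact command, so the script name and the flag names are assumptions and should be checked against the script's `--help` output before use.

```python
import shlex

# Key/value settings taken from this model card; flag names are assumptions
settings = {
    "pretrained_model_name_or_path": "stable-diffusion-v1-5/stable-diffusion-v1-5",
    "dataset_name": "lambdalabs/naruto-blip-captions",
    "resolution": 512,
    "train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "max_train_steps": 800,
    "learning_rate": 1e-5,
    "max_grad_norm": 1,
    "lr_scheduler": "constant",
    "lr_warmup_steps": 0,
    "validation_prompt": "A Naruto character",
    "num_validation_images": 4,
    "checkpointing_steps": 500,
    "output_dir": "sd-naruto-model",
}

# Boolean switches (assumed flag names) that take no value
switches = [
    "center_crop",
    "random_flip",
    "gradient_checkpointing",
    "enable_xformers_memory_efficient_attention",
    "push_to_hub",
]

args = ["accelerate", "launch", "train_text_to_image_lora.py"]
args += [f"--{name}={value}" for name, value in settings.items()]
args += [f"--{name}" for name in switches]

# shlex.join quotes arguments that contain spaces, such as the validation prompt
command = shlex.join(args)
print(command)
```

Building the command from a dictionary keeps the values next to their names, which makes it easier to compare a rerun against the configuration documented here.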