
	---
	base_model: stable-diffusion-v1-5/stable-diffusion-v1-5
	library_name: diffusers
	license: creativeml-openrail-m
	inference: true
	tags:
	- stable-diffusion
	- stable-diffusion-diffusers
	- text-to-image
	- diffusers
	- diffusers-training
	- lora
	datasets:
	- lambdalabs/naruto-blip-captions
	---



	# LoRA text2image fine-tuning - Bhaskar009/SD_1.5_LoRA
	These are LoRA adaptation weights for stable-diffusion-v1-5/stable-diffusion-v1-5. The weights were fine-tuned on the lambdalabs/naruto-blip-captions dataset. Example images are shown below.

	![img_0](./image_0.png)
	![img_1](./image_1.png)
	![img_2](./image_2.png)
	![img_3](./image_3.png)



	## Intended uses & limitations

	#### How to use

	```python
	import torch
	import matplotlib.pyplot as plt
	from diffusers import DiffusionPipeline

	# Use the GPU when available; CPU inference works but is much slower
	device = "cuda" if torch.cuda.is_available() else "cpu"

	# Load the base Stable Diffusion v1.5 pipeline
	pipe = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

	# Load the fine-tuned LoRA weights on top of the base model
	pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")

	# Move the whole pipeline to the selected device (only needed once)
	pipe.to(device)

	# Define a Naruto-themed prompt
	prompt = "A detailed anime-style portrait of Naruto Uzumaki, wearing his Hokage cloak, standing under a bright sunset, ultra-detailed, cinematic lighting, 8K"

	# Generate the image (a PIL image is returned)
	image = pipe(prompt).images[0]

	# Display the image using matplotlib
	plt.figure(figsize=(6, 6))
	plt.imshow(image)
	plt.axis("off")  # Hide axes for a clean view
	plt.show()
	```

	#### Limitations and bias

	[TODO: provide examples of latent issues and potential remediations]

	## Training details - Stable Diffusion LoRA

	#### Dataset

	- The model was trained using the `lambdalabs/naruto-blip-captions` dataset.
	- This dataset consists of Naruto character images with BLIP-generated captions.
	- It provides a diverse set of characters, poses, and backgrounds, making it suitable for fine-tuning Stable Diffusion on anime-style images.

	#### Model

	- Base Model: Stable Diffusion v1.5 (stable-diffusion-v1-5/stable-diffusion-v1-5)
	- Fine-tuning Method: LoRA (Low-Rank Adaptation)
	- Purpose: Specializing Stable Diffusion to generate Naruto-style anime characters.

	#### Preprocessing

	- Images were resized to 512x512 resolution.
	- Center cropping was applied to obtain square images without distorting the aspect ratio.
	- Random flipping was used as a data augmentation technique.
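
The resize and center-crop steps above can be sketched as plain geometry. This is a minimal illustration, assuming the shorter side is scaled to 512 pixels first and a 512x512 window is then cut from the center; the card does not spell out the exact order of operations, and `resize_then_center_crop` is a hypothetical helper, not a function from the training script.

```python
def resize_then_center_crop(width, height, size=512):
    """Return the resized dimensions and the centered crop box.

    Assumption: the shorter side is scaled to `size` (aspect ratio kept),
    then a size x size window is taken from the center.
    """
    short_side = min(width, height)
    # Integer arithmetic keeps the shorter side at exactly `size`
    new_w = width * size // short_side
    new_h = height * size // short_side
    left = (new_w - size) // 2
    top = (new_h - size) // 2
    # Box layout is (left, top, right, bottom)
    return (new_w, new_h), (left, top, left + size, top + size)


# A landscape 1024x768 image loses equal strips on the left and right
print(resize_then_center_crop(1024, 768))
# A portrait 600x900 image loses equal strips on the top and bottom
print(resize_then_center_crop(600, 900))
```

Only the longer side is cropped, so the content in the retained window keeps its original proportions.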

	#### Training Configuration

	- Batch Size: 1
	- Gradient Accumulation Steps: 4 (simulates a larger batch size)
	- Gradient Checkpointing: Enabled (reduces memory consumption)
	- Max Training Steps: 800
	- Learning Rate: 1e-5 (constant schedule, no warmup)
	- Max Gradient Norm: 1 (gradients are clipped to prevent them from exploding)
	- Memory Optimization: xFormers enabled for memory-efficient attention computation
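
The interaction between batch size and gradient accumulation can be made concrete with a little arithmetic. The sketch below assumes a single training process and that "Max Training Steps" counts optimizer updates (as is typical for the diffusers example scripts); neither is stated in the card, so `num_processes` is a labeled assumption.

```python
# Values taken from the training configuration above
batch_size = 1
gradient_accumulation_steps = 4
max_train_steps = 800

# Assumption: one training process (the card does not state the GPU count)
num_processes = 1

# Samples contributing to each optimizer update
effective_batch_size = batch_size * gradient_accumulation_steps * num_processes

# Total training samples processed over the whole run
samples_seen = effective_batch_size * max_train_steps

print(effective_batch_size)  # 4
print(samples_seen)          # 3200
```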

	#### Validation

	- A validation prompt "A Naruto character" was used.
	- 4 validation images were generated during training.
	- Model checkpoints were saved every 500 steps.
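
Combined with the 800-step limit from the training configuration, the 500-step checkpoint interval implies a single intermediate checkpoint. A minimal sketch of that schedule, assuming checkpoints are written whenever the step count is a multiple of the interval:

```python
max_train_steps = 800
checkpointing_steps = 500

# Steps at which an intermediate checkpoint would be written
checkpoint_steps = list(
    range(checkpointing_steps, max_train_steps + 1, checkpointing_steps)
)

print(checkpoint_steps)  # [500]
```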

	#### Model Output

	- The fine-tuned LoRA model was saved to "sd-naruto-model".
	- The model was pushed to the Hugging Face Hub repository Bhaskar009/SD_1.5_LoRA.
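
For readers who want to reproduce a similar run, the settings listed in this card can be collected into a launch command. The sketch below assumes the diffusers example script `train_text_to_image_lora.py` launched through `accelerate`; the card does not name the script, so the script name and the flag spellings are assumptions based on that example, and `to_cli_args` is a hypothetical helper.

```python
import shlex

# Settings as reported in this model card
config = {
    "pretrained_model_name_or_path": "stable-diffusion-v1-5/stable-diffusion-v1-5",
    "dataset_name": "lambdalabs/naruto-blip-captions",
    "resolution": 512,
    "center_crop": True,
    "random_flip": True,
    "train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "gradient_checkpointing": True,
    "max_train_steps": 800,
    "learning_rate": "1e-5",
    "lr_scheduler": "constant",
    "lr_warmup_steps": 0,
    "max_grad_norm": 1,
    "enable_xformers_memory_efficient_attention": True,
    "validation_prompt": "A Naruto character",
    "num_validation_images": 4,
    "checkpointing_steps": 500,
    "output_dir": "sd-naruto-model",
    "push_to_hub": True,
}


def to_cli_args(cfg):
    """Turn a settings dict into --flag / --name=value arguments."""
    args = []
    for name, value in cfg.items():
        if value is True:
            args.append(f"--{name}")  # boolean switches take no value
        elif value is not False and value is not None:
            args.append(f"--{name}={value}")
    return args


# Assumed entry point: the diffusers text-to-image LoRA example script
command = ["accelerate", "launch", "train_text_to_image_lora.py", *to_cli_args(config)]
print(shlex.join(command))
```

Keeping the settings in a dictionary makes it easy to change one value, such as the step count, without retyping the whole command.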