Rishi Desai
title: LoRACaptioner
emoji: 🤠
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.25.2
app_file: demo.py
pinned: false

LoRACaptioner

  • Image Captioning: Automatically generate detailed and structured captions for your LoRA dataset.
  • Prompt Optimization: Enhance prompts during inference to achieve high-quality outputs.

[Images: Sukuna example 4, Sukuna example 5, Sukuna example 6, Sukuna example 7]

Installation

Prerequisites

  • Python with the venv module and pip
  • A Together API key with available funds

Setup

  1. Create the virtual environment:

    python -m venv venv
    source venv/bin/activate
    python -m pip install -r requirements.txt
    
  2. Run inference on one set of images:

    python main.py --input examples/ --output output/
    
    Arguments
    • --input (str): Directory containing images to caption.
    • --output (str): Directory to save images and captions (defaults to the input directory).
    • --batch_images (flag): Caption images in batches by category.
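
A minimal sketch of how a parser for these three arguments could look, using Python's standard `argparse`. The function names `build_parser` and `parse_args` are hypothetical, and treating `--input` as required is an assumption; the actual `main.py` may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments documented above."""
    parser = argparse.ArgumentParser(description="Caption images for a LoRA dataset.")
    parser.add_argument("--input", type=str, required=True,
                        help="Directory containing images to caption.")
    parser.add_argument("--output", type=str, default=None,
                        help="Directory to save images and captions.")
    parser.add_argument("--batch_images", action="store_true",
                        help="Caption images in batches by category.")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and apply the documented default for --output."""
    args = build_parser().parse_args(argv)
    # Fall back to the input directory when --output is omitted.
    if args.output is None:
        args.output = args.input
    return args
```

With this sketch, `parse_args(["--input", "examples/"])` yields an `output` equal to `examples/`, matching the documented default.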

Gradio Web Interface

Launch a user-friendly web interface for captioning and prompt optimization:

python demo.py

Notes

  • In standard mode, each image is captioned individually.
  • For large collections, batch processing by category (--batch_images) is recommended.
  • Each caption is saved as a .txt file with the same base name as its image.
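
The caption naming rule can be sketched as follows. This is an illustration only: `caption_path_for` and `save_caption` are hypothetical helper names, and UTF-8 encoding is an assumption, not something this README specifies.

```python
from pathlib import Path


def caption_path_for(image_path, output_dir=None) -> Path:
    """Return the .txt path that shares the image's base name."""
    image_path = Path(image_path)
    # Default to the image's own directory when no output directory is given.
    target_dir = Path(output_dir) if output_dir is not None else image_path.parent
    return target_dir / (image_path.stem + ".txt")


def save_caption(image_path, caption: str, output_dir=None) -> Path:
    """Write the caption next to (or in place of) the image location."""
    target = caption_path_for(image_path, output_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(caption, encoding="utf-8")  # encoding is an assumption
    return target
```

For example, `caption_path_for("examples/sukuna_1.png", "output")` gives `output/sukuna_1.txt`.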

Troubleshooting

  • API errors: Ensure your Together API key is set and that the account has funds
  • Image formats: Only .png, .jpg, .jpeg, and .webp files are supported
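
Both checks can be run before captioning with a small preflight sketch like the one below. The environment variable name `TOGETHER_API_KEY` and the case-insensitive extension match are assumptions, and the helper names are hypothetical. The sketch only checks that a key is present; it cannot tell whether the account has funds.

```python
import os
from pathlib import Path

# Extensions listed as supported in the troubleshooting notes above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def find_supported_images(directory):
    """Return the files in `directory` whose extension is supported, sorted."""
    return sorted(
        p for p in Path(directory).iterdir()
        # Assumption: extensions are matched case-insensitively.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def api_key_is_set(env_var: str = "TOGETHER_API_KEY") -> bool:
    """Report whether the (assumed) API key variable holds a non-empty value."""
    return bool(os.environ.get(env_var, "").strip())
```

If `find_supported_images` returns an empty list for your input directory, the images are likely in an unsupported format.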

Examples

User Prompt:

holding a bow and arrow in a dense forest

<h5>Optimized Prompt:</h5>
<p class="optimized-prompt">tr1gg3r anime-style, pink spiky hair and black markings on face, shirtless with dark arm bands, holding bow and arrow, focused expression, dense forest, soft dappled lighting, three-quarter view</p>

<div class="example-image">
  <img src="/spaces/rdesai2/LoRACaptioner/resolve/main/examples/sukuna_1.png" alt="Sukuna with bow and arrow">
</div>

User Prompt:

drinking coffee in a san francisco cafe, white cloak, side view

<h5>Optimized Prompt:</h5>
<p class="optimized-prompt">tr1gg3r anime-style, spiky pink hair and facial markings, white cloak, sitting with cup in hand, neutral expression, cafe interior with san francisco view, soft natural lighting, side profile</p>

<div class="example-image">
  <img src="/spaces/rdesai2/LoRACaptioner/resolve/main/examples/sukuna_2.png" alt="Sukuna drinking coffee">
</div>

User Prompt:

playing pick-up basketball on a sunny day

<h5>Optimized Prompt:</h5>
<p class="optimized-prompt">tr1gg3r photorealistic, athletic build, sleeveless basketball jersey and shorts, jumping with ball, focused expression, outdoor basketball court with spectators, bright sunlight, low-angle view</p>

<div class="example-image">
  <img src="/spaces/rdesai2/LoRACaptioner/resolve/main/examples/sukuna_3.png" alt="Sukuna playing basketball">
</div>

User Prompt:

riding a horse on a prairie during sunset

<h5>Optimized Prompt:</h5>
<p class="optimized-prompt">tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, riding a horse, neutral expression, prairie during sunset, warm directional lighting, three-quarter view</p>

<div class="example-image">
  <img src="/spaces/rdesai2/LoRACaptioner/resolve/main/examples/woman_1.png" alt="Woman riding a horse">
</div>

User Prompt:

painting on a canvas in an art studio, side-view

<h5>Optimized Prompt:</h5>
<p class="optimized-prompt">tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing at an angle with brush in hand, neutral expression, art studio with canvas and paints, soft natural lighting, right side profile</p>

<div class="example-image">
  <img src="/spaces/rdesai2/LoRACaptioner/resolve/main/examples/woman_2.png" alt="Woman painting in studio">
</div>

User Prompt:

standing on a skyscraper in a dense city, dramatic stormy lighting, rear view

<h5>Optimized Prompt:</h5>
<p class="optimized-prompt">tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing upright, neutral expression, skyscraper rooftop in dense city, dramatic stormy lighting, back view</p>

<div class="example-image">
  <img src="/spaces/rdesai2/LoRACaptioner/resolve/main/examples/woman_3.png" alt="Woman on skyscraper">
</div>
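
The optimized prompts above appear to share one layout: trigger word and style, then appearance, clothing, pose, expression, setting, lighting, and camera angle. The sketch below assembles a prompt in that order. The field names are labels inferred from the examples, and `PromptParts` and `assemble_prompt` are hypothetical; this is not the project's actual optimization code, which uses a model to produce these prompts.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Fields inferred from the example prompts (names are assumptions)."""
    trigger: str
    style: str
    appearance: str
    clothing: str
    pose: str
    expression: str
    setting: str
    lighting: str
    camera_angle: str


def assemble_prompt(parts: PromptParts) -> str:
    """Join the fields in the order seen in the examples."""
    # Trigger word and style are separated by a space, not a comma.
    head = f"{parts.trigger} {parts.style}"
    rest = [parts.appearance, parts.clothing, parts.pose, parts.expression,
            parts.setting, parts.lighting, parts.camera_angle]
    # Skip empty fields so no stray commas appear.
    return ", ".join([head] + [field for field in rest if field])
```

Filling the fields with the values from the skyscraper example reproduces that optimized prompt exactly.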

License

MIT License