title: 'PicMatch: Your Visual Search Companion'
colorFrom: blue
colorTo: green
sdk: gradio
python_version: 3.9
sdk_version: 4.39.0
suggested_hardware: t4-small
suggested_storage: medium
app_file: app.py
short_description: Search images using text or other images as queries.
models:
- wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M
- Salesforce/blip-image-captioning-base
pinned: false
license: mit
PicMatch: Your Visual Search Companion
PicMatch lets you effortlessly search through your image archive using either a text description or another image as your query. Find those needle-in-a-haystack photos in a flash!
Getting Started: Let the Fun Begin!
Prerequisites: Ensure you have Python 3.9 or higher installed on your system.
Create a Virtual Environment:
python -m venv env
Activate the Environment:
source ./env/bin/activate
Install Dependencies:
python -m pip install -r requirements.txt
Start the App (with Sample Data):
python app.py
Open Your Browser: Head to localhost:7860 to access the PicMatch interface.
Data: Organize Your Visual Treasures
Make sure you have the following folders in your project's root directory:
data
├── images
└── features
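If the folders do not exist yet, a few lines of Python can create them. This is only a convenience sketch: the helper name ensure_data_dirs is made up for illustration and is not part of PicMatch.

```python
from pathlib import Path
from typing import List


def ensure_data_dirs(root: Path) -> List[Path]:
    """Create data/images and data/features under `root` if they are missing.

    Hypothetical helper, not part of the PicMatch code base.
    """
    dirs = [root / "data" / "images", root / "data" / "features"]
    for directory in dirs:
        # parents=True also creates the intermediate data/ folder;
        # exist_ok=True makes repeated calls harmless.
        directory.mkdir(parents=True, exist_ok=True)
    return dirs
```

Calling ensure_data_dirs(Path(".")) from the project root sets up the layout shown above.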
Image Pipeline: Download & Process with Speed
The engine/download_data.py Python script streamlines downloading and processing images from a list of URLs. It's designed for performance and reliability:
- Async Operations: Uses asyncio for concurrent image downloading and processing.
- Rate Limiting: Follows API usage rules to prevent blocks with a RateLimiter.
- Parallel Resizing: Employs a ProcessPoolExecutor for fast image resizing.
- State Management: Saves progress in a JSON file so you can resume later.
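To make the rate-limiting idea concrete, here is a minimal sketch of one way an asyncio limiter could work. It is an assumption for illustration, not the RateLimiter used in engine/download_data.py: the class name IntervalRateLimiter and the even-spacing strategy are both invented here.

```python
import asyncio
import time


class IntervalRateLimiter:
    """Hypothetical limiter: spaces acquisitions at least 1/rate seconds apart."""

    def __init__(self, rate: float) -> None:
        # Create the limiter inside a running event loop so the lock
        # binds to that loop on older Python versions.
        self._interval = 1.0 / rate
        self._lock = asyncio.Lock()
        self._next_slot = 0.0  # monotonic time of the next free slot

    async def acquire(self) -> None:
        async with self._lock:
            now = time.monotonic()
            if now < self._next_slot:
                await asyncio.sleep(self._next_slot - now)
            # Reserve the following slot relative to when this caller ran.
            self._next_slot = max(time.monotonic(), self._next_slot) + self._interval


async def timed_calls(n: int, rate: float) -> float:
    """Run n rate-limited no-op calls and return the elapsed seconds."""
    limiter = IntervalRateLimiter(rate)
    start = time.monotonic()

    async def worker() -> None:
        await limiter.acquire()  # a real worker would download here

    await asyncio.gather(*(worker() for _ in range(n)))
    return time.monotonic() - start
```

Each downloader would await limiter.acquire() before issuing a request, so concurrent tasks still respect the overall request rate.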
Key Components:
- ImagePipeline Class: Manages the entire pipeline, its state, and rate limiting.
- Functions: Handle URL feeding (url_feeder), downloading (image_downloader), and processing (image_processor).
- ImageSaver Class: Defines how images are processed and saved.
- resize_image Function: Ensures image resizing maintains the correct aspect ratio.
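Keeping the aspect ratio comes down to scaling both sides by the same factor. The sketch below shows that arithmetic on its own; fit_within and its max_side parameter are hypothetical names, and the real resize_image may choose its target size differently.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_side.

    Hypothetical helper: both sides use the same scale factor, which is
    what preserves the aspect ratio. Smaller images are left untouched.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never upscale
    scale = max_side / longest
    # max(1, ...) guards against a side rounding down to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With Pillow, the resulting tuple could be passed to Image.resize, or Image.thumbnail could be used instead, since it also preserves the aspect ratio.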
How it Works:
- Start: Configure the pipeline with your URL list, download limits, and rate settings.
- Feed URLs: Asynchronously read URLs from your file.
- Download: Download images concurrently while respecting rate limits.
- Process: Save the original images and resize them in parallel.
- Save State: Regularly save progress to avoid starting over if interrupted.
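The steps above can be sketched as a small producer/consumer pipeline built on asyncio queues. Everything here is an assumption made for illustration: the function names, the fetch and handle callbacks, and the state file format are invented, and processing runs inline instead of in a ProcessPoolExecutor to keep the example short.

```python
import asyncio
import json
from pathlib import Path
from typing import Awaitable, Callable, Iterable, Set

STOP = None  # sentinel telling a consumer to exit


async def feed_urls(urls: Iterable[str], queue: asyncio.Queue, consumers: int) -> None:
    for url in urls:
        await queue.put(url)
    for _ in range(consumers):
        await queue.put(STOP)  # one sentinel per download worker


async def download_worker(url_queue: asyncio.Queue, image_queue: asyncio.Queue,
                          fetch: Callable[[str], Awaitable[bytes]], done: Set[str]) -> None:
    while True:
        url = await url_queue.get()
        if url is STOP:
            break
        if url in done:
            continue  # already handled in a previous run
        await image_queue.put((url, await fetch(url)))


async def process_worker(image_queue: asyncio.Queue, handle: Callable[[str, bytes], None],
                         done: Set[str], state_file: Path) -> None:
    while True:
        item = await image_queue.get()
        if item is STOP:
            break
        url, data = item
        handle(url, data)  # e.g. save the original and a resized copy
        done.add(url)
        state_file.write_text(json.dumps(sorted(done)))  # save progress


async def run_pipeline(urls: Iterable[str], fetch: Callable[[str], Awaitable[bytes]],
                       handle: Callable[[str, bytes], None], state_file: Path,
                       downloaders: int = 3) -> Set[str]:
    # Resume from the saved state if a previous run was interrupted.
    done = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    url_queue: asyncio.Queue = asyncio.Queue(maxsize=100)
    image_queue: asyncio.Queue = asyncio.Queue(maxsize=100)
    processor = asyncio.create_task(process_worker(image_queue, handle, done, state_file))
    workers = [asyncio.create_task(download_worker(url_queue, image_queue, fetch, done))
               for _ in range(downloaders)]
    await feed_urls(urls, url_queue, downloaders)
    await asyncio.gather(*workers)
    await image_queue.put(STOP)
    await processor
    return done
```

Running the pipeline a second time with the same state file skips every URL that was already processed, which is the resume behavior described in the last step.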
To get the sample data, run:
cd engine && python download_data.py
Feature Creation: Making Your Images Searchable
This step prepares your images for searching. We generate two types of embeddings:
- Visual Embeddings (CLIP): Capture the visual content of your images.
- Textual Embeddings: Create embeddings from image captions for text-based search.
To generate these features, run:
cd engine && python generate_features.py
This process uses these awesome models from Hugging Face:
- TinyCLIP: wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M
- BLIP Image Captioning: Salesforce/blip-image-captioning-base
- SentenceTransformer: all-MiniLM-L6-v2
Asynchronous Feature Extraction: Supercharge Your Process
This script extracts image features (both visual and textual) efficiently:
- Asynchronous: Loads images, extracts features, and saves them concurrently.
- Dual Embeddings: Creates both CLIP (visual) and caption (textual) embeddings.
- Checkpoints: Keeps track of progress and allows resuming from interruptions.
- Parallel: Uses multiple CPU cores for feature extraction.
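One common way to combine asyncio with parallel workers is to hand blocking work to an executor. The sketch below shows that pattern under stated assumptions: extract_all and fake_extract are invented names, and fake_extract is only a stand-in for a real model call, so no actual embeddings are computed.

```python
import asyncio
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence


def fake_extract(path: str) -> List[float]:
    """Placeholder for a CPU-heavy model call; returns a dummy 2-value vector."""
    return [float(len(path)), float(sum(map(ord, path)) % 7)]


async def extract_all(paths: Sequence[str],
                      extract: Callable[[str], List[float]],
                      executor: Executor) -> Dict[str, List[float]]:
    """Run `extract` for every path in the executor without blocking the loop."""
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(executor, extract, path) for path in paths]
    # gather keeps results in the same order as the submitted futures.
    results = await asyncio.gather(*futures)
    return dict(zip(paths, results))


def demo() -> Dict[str, List[float]]:
    # A thread pool keeps the demo portable; see the note below for processes.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return asyncio.run(extract_all(["a.jpg", "bb.jpg"], fake_extract, pool))
```

Passing a ProcessPoolExecutor instead would spread the work across CPU cores, provided the extract function is a picklable top-level function.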
Vector Database Module: Milvus for Fast Search
This module connects to the Milvus vector database to store and search your image embeddings:
- Milvus: A high-performance database built for handling vector data.
- Easy Interface: Provides a simple way to manage embeddings and perform searches.
- Single Server: Ensures only one Milvus server is running for efficiency.
- Indexing: Automatically creates an index to speed up your searches.
- Similarity Search: Find the most similar images using cosine similarity.
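To show what a cosine-similarity search computes, here is a brute-force version in plain Python. It does not use the Milvus API and is not how PicMatch performs the search; the function names and the toy two-dimensional vectors are assumptions for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], embeddings: Dict[str, Sequence[float]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

A vector database with an index returns the same kind of ranked result without comparing the query against every stored embedding.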
References: The Brains Behind PicMatch
PicMatch leverages these incredible open-source projects:
TinyCLIP: The visual powerhouse for understanding your images.
Image Captioning: The wordsmith that describes your photos in detail.
Sentence Transformers: Turns captions into embeddings for text-based search.
- https://sbert.net
Unsplash: Images used were taken from Unsplash's open source data.
Let's give credit where credit is due! These projects make PicMatch smarter and more capable.