---
title: 'PicMatch: Your Visual Search Companion'
emoji: 📷🔍
colorFrom: blue
colorTo: green
sdk: gradio
python_version: 3.9
sdk_version: 4.39.0
suggested_hardware: t4-small
suggested_storage: medium
app_file: app.py
short_description: Search images using text or other images as queries.
models:
- wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M
- Salesforce/blip-image-captioning-base
pinned: false
license: mit
---
# 📸 PicMatch: Your Visual Search Companion 🔍
PicMatch lets you effortlessly search through your image archive using either a text description or another image as your query. Find those needle-in-a-haystack photos in a flash! ✨
## 🚀 Getting Started: Let the Fun Begin!
1. **Prerequisites:** Ensure you have Python 3.9 or higher installed on your system. 🐍
2. **Create a Virtual Environment:**
```bash
python -m venv venv
```
3. **Activate the Environment:**
```bash
source venv/bin/activate
```
4. **Install Dependencies:**
```bash
python -m pip install -r requirements.txt
```
5. **Start the App (with Sample Data):**
```bash
python app.py
```
6. **Open Your Browser:** Head to `localhost:7860` to access the PicMatch interface. 🌐
## 📂 Data: Organize Your Visual Treasures
Make sure you have the following folders in your project's root directory:
```
data
├── images
└── features
```
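If you prefer not to create the folders by hand, a short Python snippet (equivalent to `mkdir -p data/images data/features`) does the same thing:

```python
from pathlib import Path

# Create the expected directory layout; exist_ok avoids errors on re-runs.
for sub in ("images", "features"):
    Path("data", sub).mkdir(parents=True, exist_ok=True)
```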
## 🛠️ Image Pipeline: Download & Process with Speed ⚡
The `engine/download_data.py` Python script streamlines downloading and processing images from a list of URLs. It's designed for performance and reliability:
- **Async Operations:** Uses `asyncio` for concurrent image downloading and processing. ⏩
- **Rate Limiting:** Uses a `RateLimiter` to respect API usage rules and avoid blocks. 🚦
- **Parallel Resizing:** Employs a `ProcessPoolExecutor` for fast image resizing. ⚙️
- **State Management:** Saves progress in a JSON file so you can resume later. 💾
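The async-download-plus-rate-limit pattern above can be sketched roughly as follows. This is a minimal illustration, not the actual `download_data.py` code: the `RateLimiter` here is a simple interval limiter, and `download` is a placeholder for a real HTTP fetch (e.g. with `aiohttp`).

```python
import asyncio
import time

class RateLimiter:
    """Sketch: allow at most `rate` acquisitions per second."""
    def __init__(self, rate: float):
        self.interval = 1.0 / rate
        self._last = 0.0
        self._lock = asyncio.Lock()

    async def acquire(self):
        async with self._lock:
            now = time.monotonic()
            wait = self._last + self.interval - now
            if wait > 0:
                await asyncio.sleep(wait)
            self._last = time.monotonic()

async def download(url: str, limiter: RateLimiter) -> bytes:
    await limiter.acquire()
    # Placeholder for a real HTTP fetch; returns the URL itself as fake data.
    return url.encode()

async def run(urls):
    limiter = RateLimiter(rate=100)
    # gather() runs the downloads concurrently but preserves input order.
    return await asyncio.gather(*(download(u, limiter) for u in urls))

results = asyncio.run(run([f"https://example.com/{i}.jpg" for i in range(5)]))
```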
### 🗝️ Key Components:
- **`ImagePipeline` Class:** Manages the entire pipeline, its state, and rate limiting. 🏗️
- **Functions:** Handle URL feeding (`url_feeder`), downloading (`image_downloader`), and processing (`image_processor`). 📥
- **`ImageSaver` Class:** Defines how images are processed and saved. 🖼️
- **`resize_image` Function:** Ensures image resizing maintains the correct aspect ratio. 📐
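The aspect-ratio-preserving logic of a function like `resize_image` boils down to one scale factor. The helper below (`fit_within` is a hypothetical name, not from the repo) shows the dimension math that would run before handing off to an image library like Pillow:

```python
def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Compute dimensions that fit within max_side on both axes while
    preserving the aspect ratio. Never upscales (scale capped at 1.0)."""
    scale = min(max_side / width, max_side / height, 1.0)
    return round(width * scale), round(height * scale)
```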
### 🔄 How it Works:
1. **Start:** Configure the pipeline with your URL list, download limits, and rate settings.
2. **Feed URLs:** Asynchronously read URLs from your file.
3. **Download:** Download images concurrently while respecting rate limits.
4. **Process:** Save the original images and resize them in parallel.
5. **Save State:** Regularly save progress to avoid starting over if interrupted.
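The save-state step (5) amounts to periodically serializing the set of completed URLs to JSON. A minimal sketch, with a hypothetical file name and schema (the real script's format may differ):

```python
import json
from pathlib import Path

STATE_FILE = Path("pipeline_state.json")  # hypothetical file name

def save_state(done_urls: list[str]) -> None:
    # Persist progress so an interrupted run can pick up where it left off.
    STATE_FILE.write_text(json.dumps({"done": done_urls}))

def load_state() -> set[str]:
    # On startup, skip any URL already recorded as done.
    if STATE_FILE.exists():
        return set(json.loads(STATE_FILE.read_text())["done"])
    return set()
```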
To get the sample data, run:
```bash
cd engine && python download_data.py
```
## ✨ Feature Creation: Making Your Images Searchable ✨
This step prepares your images for searching. We generate two types of embeddings:
- **Visual Embeddings (CLIP):** Capture the visual content of your images. 👁️‍🗨️
- **Textual Embeddings:** Create embeddings from image captions for text-based search. 💬
To generate these features, run:
```bash
cd engine && python generate_features.py
```
This process uses these awesome models from Hugging Face:
- TinyCLIP: `wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M`
- BLIP Image Captioning: `Salesforce/blip-image-captioning-base`
- SentenceTransformer: `all-MiniLM-L6-v2`
## ⚡ Asynchronous Feature Extraction: Supercharge Your Process ⚡
This script extracts image features (both visual and textual) efficiently:
- **Asynchronous:** Loads images, extracts features, and saves them concurrently. ⚡
- **Dual Embeddings:** Creates both CLIP (visual) and caption (textual) embeddings. 🖼️📝
- **Checkpoints:** Keeps track of progress and allows resuming from interruptions. 📍
- **Parallel:** Uses multiple CPU cores for feature extraction. ⚙️
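The multi-core part can be sketched with `ProcessPoolExecutor`. In this illustration `extract_features` is a stub standing in for the real model inference (running CLIP and the captioning model is far too heavy for a snippet), so only the fan-out structure matches the actual script:

```python
from concurrent.futures import ProcessPoolExecutor

def extract_features(path: str) -> tuple[str, list[float]]:
    """Hypothetical worker: the real script would run the CLIP and
    captioning models here; this stub returns a dummy embedding."""
    return path, [float(len(path))]

def extract_all(paths: list[str]) -> dict[str, list[float]]:
    # Spread the images across worker processes, one per CPU core.
    with ProcessPoolExecutor() as pool:
        return dict(pool.map(extract_features, paths))
```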
## 🔍 Vector Database Module: Milvus for Fast Search 🤖
This module connects to the Milvus vector database to store and search your image embeddings:
- **Milvus:** A high-performance database built for handling vector data. 🚀
- **Easy Interface:** Provides a simple way to manage embeddings and perform searches. 🔌
- **Single Server:** Ensures only one Milvus server is running for efficiency.
- **Indexing:** Automatically creates an index to speed up your searches. 🚀
- **Similarity Search:** Find the most similar images using cosine similarity. 🎯
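Cosine similarity scores two embeddings by the angle between them, independent of magnitude. The toy example below shows the idea with a plain dictionary standing in for Milvus (in production the index does this at scale, approximately and much faster):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def top_k(query, index, k=2):
    """Return the k image IDs whose vectors are most similar to `query`.
    `index` maps image ID -> embedding (a stand-in for the Milvus index)."""
    return sorted(index, key=lambda i: cosine(query, index[i]), reverse=True)[:k]
```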
## 📚 References: The Brains Behind PicMatch 🧠
PicMatch leverages these incredible open-source projects:
- **TinyCLIP:** The visual powerhouse for understanding your images.
  - 🔗 [https://huggingface.co/wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M](https://huggingface.co/wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M)
- **Image Captioning:** The wordsmith that describes your photos in detail.
  - 🔗 [https://huggingface.co/Salesforce/blip-image-captioning-base](https://huggingface.co/Salesforce/blip-image-captioning-base)
- **Sentence Transformers:** Turns captions into embeddings for text-based search.
  - 🔗 [https://sbert.net](https://sbert.net)
- **Unsplash:** The images used come from Unsplash's open-source dataset.
  - 🔗 [https://github.com/unsplash/datasets](https://github.com/unsplash/datasets)
Let's give credit where credit is due! 👏 These projects make PicMatch smarter and more capable.