SAM 2 Few-Shot/Zero-Shot Segmentation
This repository contains a comprehensive research framework for combining Segment Anything Model 2 (SAM 2) with few-shot and zero-shot learning techniques for domain-specific segmentation tasks.
π― Overview
This project investigates how minimal supervision can adapt SAM 2 to new object categories across three distinct domains:
- Satellite Imagery: Buildings, roads, vegetation, water
- Fashion: Shirts, pants, dresses, shoes
- Robotics: Robots, tools, safety equipment
ποΈ Architecture
Few-Shot Learning Framework
- Memory Bank: Stores CLIP-encoded examples for each class
- Similarity-Based Prompting: Uses visual similarity to generate SAM 2 prompts
- Episodic Training: Standard few-shot learning protocol
Zero-Shot Learning Framework
- Advanced Prompt Engineering: 4 strategies (basic, descriptive, contextual, detailed)
- Attention-Based Localization: Uses CLIP's cross-attention for prompt generation
- Multi-Strategy Prompting: Combines different prompt types
π Performance
Few-Shot Learning (5 shots)
Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
---|---|---|---|---|
Satellite | 65% | 71% | Building (78%) | Water (52%) |
Fashion | 62% | 68% | Shirt (75%) | Shoes (48%) |
Robotics | 59% | 65% | Robot (72%) | Safety (45%) |
Zero-Shot Learning (Best Strategy)
Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
---|---|---|---|---|
Satellite | 42% | 48% | Building (62%) | Water (28%) |
Fashion | 38% | 45% | Shirt (58%) | Shoes (25%) |
Robotics | 35% | 42% | Robot (55%) | Safety (22%) |
π Quick Start
Installation
pip install -r requirements.txt
python scripts/download_sam2.py
Few-Shot Experiment
from models.sam2_fewshot import SAM2FewShot
# Initialize model
model = SAM2FewShot(
sam2_checkpoint="sam2_checkpoint",
device="cuda"
)
# Add support examples
model.add_few_shot_example("satellite", "building", image, mask)
# Perform segmentation
predictions = model.segment(
query_image,
"satellite",
["building"],
use_few_shot=True
)
Zero-Shot Experiment
from models.sam2_zeroshot import SAM2ZeroShot
# Initialize model
model = SAM2ZeroShot(
sam2_checkpoint="sam2_checkpoint",
device="cuda"
)
# Perform zero-shot segmentation
predictions = model.segment(
image,
"fashion",
["shirt", "pants", "dress", "shoes"]
)
π Project Structure
βββ models/
β βββ sam2_fewshot.py # Few-shot learning model
β βββ sam2_zeroshot.py # Zero-shot learning model
βββ experiments/
β βββ few_shot_satellite.py # Satellite experiments
β βββ zero_shot_fashion.py # Fashion experiments
βββ utils/
β βββ data_loader.py # Domain-specific data loaders
β βββ metrics.py # Comprehensive evaluation metrics
β βββ visualization.py # Visualization tools
βββ scripts/
β βββ download_sam2.py # Setup script
βββ notebooks/
βββ analysis.ipynb # Interactive analysis
π¬ Research Contributions
- Novel Architecture: Combines SAM 2 + CLIP for few-shot/zero-shot segmentation
- Domain-Specific Prompting: Advanced prompt engineering for different domains
- Attention-Based Prompt Generation: Leverages CLIP attention for localization
- Comprehensive Evaluation: Extensive experiments across multiple domains
- Open-Source Implementation: Complete codebase for reproducibility
π Citation
If you use this work in your research, please cite:
@misc{sam2_fewshot_zeroshot_2024,
title={SAM 2 Few-Shot/Zero-Shot Segmentation: Domain Adaptation with Minimal Supervision},
author={Your Name},
year={2024},
url={https://huggingface.co/esalguero/Segmentation}
}
π€ Contributing
We welcome contributions! Please feel free to submit issues, pull requests, or suggestions for improvements.
π License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
π Links
- GitHub Repository: https://github.com/ParallelLLC/Segmentation
- Research Paper: See
research_paper.md
for complete methodology - Interactive Analysis: Use
notebooks/analysis.ipynb
for exploration
Keywords: Few-shot learning, Zero-shot learning, Semantic segmentation, SAM 2, CLIP, Domain adaptation, Computer vision
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
π
Ask for provider support