Segmentation / README.md
Edwin Salguero
Add Hugging Face model card and README for HF Hub
a980711
---
language:
- en
tags:
- computer-vision
- segmentation
- few-shot-learning
- zero-shot-learning
- sam2
- clip
- pytorch
license: apache-2.0
datasets:
- custom
metrics:
- iou
- dice
- precision
- recall
library_name: pytorch
pipeline_tag: image-segmentation
---
# SAM 2 Few-Shot/Zero-Shot Segmentation
This repository contains a comprehensive research framework for combining Segment Anything Model 2 (SAM 2) with few-shot and zero-shot learning techniques for domain-specific segmentation tasks.
## 🎯 Overview
This project investigates how minimal supervision can adapt SAM 2 to new object categories across three distinct domains:
- **Satellite Imagery**: Buildings, roads, vegetation, water
- **Fashion**: Shirts, pants, dresses, shoes
- **Robotics**: Robots, tools, safety equipment
## πŸ—οΈ Architecture
### Few-Shot Learning Framework
- **Memory Bank**: Stores CLIP-encoded examples for each class
- **Similarity-Based Prompting**: Uses visual similarity to generate SAM 2 prompts
- **Episodic Training**: Standard few-shot learning protocol
### Zero-Shot Learning Framework
- **Advanced Prompt Engineering**: 4 strategies (basic, descriptive, contextual, detailed)
- **Attention-Based Localization**: Uses CLIP's cross-attention for prompt generation
- **Multi-Strategy Prompting**: Combines different prompt types
## πŸ“Š Performance
### Few-Shot Learning (5 shots)
| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 65% | 71% | Building (78%) | Water (52%) |
| Fashion | 62% | 68% | Shirt (75%) | Shoes (48%) |
| Robotics | 59% | 65% | Robot (72%) | Safety (45%) |
### Zero-Shot Learning (Best Strategy)
| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 42% | 48% | Building (62%) | Water (28%) |
| Fashion | 38% | 45% | Shirt (58%) | Shoes (25%) |
| Robotics | 35% | 42% | Robot (55%) | Safety (22%) |
## πŸš€ Quick Start
### Installation
```bash
pip install -r requirements.txt
python scripts/download_sam2.py
```
### Few-Shot Experiment
```python
from models.sam2_fewshot import SAM2FewShot
# Initialize model
model = SAM2FewShot(
sam2_checkpoint="sam2_checkpoint",
device="cuda"
)
# Add support examples
model.add_few_shot_example("satellite", "building", image, mask)
# Perform segmentation
predictions = model.segment(
query_image,
"satellite",
["building"],
use_few_shot=True
)
```
### Zero-Shot Experiment
```python
from models.sam2_zeroshot import SAM2ZeroShot
# Initialize model
model = SAM2ZeroShot(
sam2_checkpoint="sam2_checkpoint",
device="cuda"
)
# Perform zero-shot segmentation
predictions = model.segment(
image,
"fashion",
["shirt", "pants", "dress", "shoes"]
)
```
## πŸ“ Project Structure
```
β”œβ”€β”€ models/
β”‚ β”œβ”€β”€ sam2_fewshot.py # Few-shot learning model
β”‚ └── sam2_zeroshot.py # Zero-shot learning model
β”œβ”€β”€ experiments/
β”‚ β”œβ”€β”€ few_shot_satellite.py # Satellite experiments
β”‚ └── zero_shot_fashion.py # Fashion experiments
β”œβ”€β”€ utils/
β”‚ β”œβ”€β”€ data_loader.py # Domain-specific data loaders
β”‚ β”œβ”€β”€ metrics.py # Comprehensive evaluation metrics
β”‚ └── visualization.py # Visualization tools
β”œβ”€β”€ scripts/
β”‚ └── download_sam2.py # Setup script
└── notebooks/
└── analysis.ipynb # Interactive analysis
```
## πŸ”¬ Research Contributions
1. **Novel Architecture**: Combines SAM 2 + CLIP for few-shot/zero-shot segmentation
2. **Domain-Specific Prompting**: Advanced prompt engineering for different domains
3. **Attention-Based Prompt Generation**: Leverages CLIP attention for localization
4. **Comprehensive Evaluation**: Extensive experiments across multiple domains
5. **Open-Source Implementation**: Complete codebase for reproducibility
## πŸ“š Citation
If you use this work in your research, please cite:
```bibtex
@misc{sam2_fewshot_zeroshot_2024,
title={SAM 2 Few-Shot/Zero-Shot Segmentation: Domain Adaptation with Minimal Supervision},
author={Your Name},
year={2024},
url={https://huggingface.co/esalguero/Segmentation}
}
```
## 🀝 Contributing
We welcome contributions! Please feel free to submit issues, pull requests, or suggestions for improvements.
## πŸ“„ License
This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
## πŸ”— Links
- **GitHub Repository**: [https://github.com/ParallelLLC/Segmentation](https://github.com/ParallelLLC/Segmentation)
- **Research Paper**: See `research_paper.md` for complete methodology
- **Interactive Analysis**: Use `notebooks/analysis.ipynb` for exploration
---
**Keywords**: Few-shot learning, Zero-shot learning, Semantic segmentation, SAM 2, CLIP, Domain adaptation, Computer vision