---
language:
- en
tags:
- computer-vision
- segmentation
- few-shot-learning
- zero-shot-learning
- sam2
- clip
- pytorch
license: apache-2.0
datasets:
- custom
metrics:
- iou
- dice
- precision
- recall
library_name: pytorch
pipeline_tag: image-segmentation
---
# SAM 2 Few-Shot/Zero-Shot Segmentation
This repository provides a research framework that combines Segment Anything Model 2 (SAM 2) with few-shot and zero-shot learning techniques for domain-specific segmentation tasks.
## 🎯 Overview
This project investigates how minimal supervision can adapt SAM 2 to new object categories across three distinct domains:
- **Satellite Imagery**: Buildings, roads, vegetation, water
- **Fashion**: Shirts, pants, dresses, shoes
- **Robotics**: Robots, tools, safety equipment
## 🏗️ Architecture
### Few-Shot Learning Framework
- **Memory Bank**: Stores CLIP-encoded support examples for each class (a minimal sketch follows this list)
- **Similarity-Based Prompting**: Uses visual similarity to the stored examples to generate SAM 2 prompts
- **Episodic Training**: Standard few-shot learning protocol
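The memory-bank and similarity components above can be pictured with a short sketch. This is a hedged approximation assuming CLIP image embeddings from Hugging Face `transformers`; the class name `ClipMemoryBank` and its methods are illustrative, not the repository's actual API.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

class ClipMemoryBank:
    """Illustrative memory bank: stores L2-normalized CLIP embeddings
    of support examples keyed by (domain, class_name)."""

    def __init__(self, model_name="openai/clip-vit-base-patch32", device="cpu"):
        self.model = CLIPModel.from_pretrained(model_name).to(device).eval()
        self.processor = CLIPProcessor.from_pretrained(model_name)
        self.device = device
        self.bank = {}  # (domain, class_name) -> list of (1, D) tensors

    @torch.no_grad()
    def _embed(self, image):
        # image: a PIL image (or NumPy array) of a support or query crop
        inputs = self.processor(images=image, return_tensors="pt").to(self.device)
        feats = self.model.get_image_features(**inputs)
        return feats / feats.norm(dim=-1, keepdim=True)

    def add_example(self, domain, class_name, image):
        self.bank.setdefault((domain, class_name), []).append(self._embed(image))

    def best_similarity(self, domain, class_name, query_image):
        # Cosine similarity (embeddings are unit-normalized) between the query
        # and all stored support examples; the max score can be thresholded
        # to decide whether and where to prompt SAM 2 for this class.
        support = torch.cat(self.bank[(domain, class_name)], dim=0)  # (N, D)
        query = self._embed(query_image)                             # (1, D)
        return (query @ support.T).max().item()
```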
### Zero-Shot Learning Framework
- **Advanced Prompt Engineering**: Four strategies (basic, descriptive, contextual, detailed), with example templates sketched below
- **Attention-Based Localization**: Uses CLIP's cross-attention for prompt generation (a simplified sketch follows the zero-shot example under Quick Start)
- **Multi-Strategy Prompting**: Combines the different prompt types
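The four prompt strategies can be pictured as simple text templates. The exact wording used by `SAM2ZeroShot` may differ, so treat these as illustrative:

```python
# Illustrative templates for the four strategies; {cls} and {domain} are
# filled in per query. The repository's actual prompt wording may differ.
PROMPT_STRATEGIES = {
    "basic": "{cls}",
    "descriptive": "a photo of a {cls}",
    "contextual": "a {cls} in a {domain} scene",
    "detailed": "a high-resolution {domain} image showing a {cls} with clear boundaries",
}

def build_prompts(domain, class_name):
    """Return one text prompt per strategy for a given domain and class."""
    return {name: template.format(cls=class_name, domain=domain)
            for name, template in PROMPT_STRATEGIES.items()}
```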
## 📊 Performance
### Few-Shot Learning (5 shots)
| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 65% | 71% | Building (78%) | Water (52%) |
| Fashion | 62% | 68% | Shirt (75%) | Shoes (48%) |
| Robotics | 59% | 65% | Robot (72%) | Safety (45%) |
### Zero-Shot Learning (Best Strategy)
| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 42% | 48% | Building (62%) | Water (28%) |
| Fashion | 38% | 45% | Shirt (58%) | Shoes (25%) |
| Robotics | 35% | 42% | Robot (55%) | Safety (22%) |
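For reference, IoU and Dice in the tables above are standard overlap metrics between a predicted and a ground-truth binary mask. A minimal NumPy sketch in the spirit of `utils/metrics.py` (the repository's implementation may differ):

```python
import numpy as np

def iou_dice(pred, target, eps=1e-7):
    """IoU and Dice for two binary masks of shape (H, W)."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = intersection / (union + eps)
    dice = 2 * intersection / (pred.sum() + target.sum() + eps)
    return iou, dice
```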
## 🚀 Quick Start
### Installation
```bash
pip install -r requirements.txt
python scripts/download_sam2.py
```
### Few-Shot Experiment
```python
from models.sam2_fewshot import SAM2FewShot

# Initialize the model (point sam2_checkpoint at your downloaded SAM 2 weights)
model = SAM2FewShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Add support examples: image and mask are a support image and its
# binary mask for the target class (e.g., loaded as NumPy arrays)
model.add_few_shot_example("satellite", "building", image, mask)

# Segment a new query image using the stored support examples
predictions = model.segment(
    query_image,
    "satellite",
    ["building"],
    use_few_shot=True
)
```
### Zero-Shot Experiment
```python
from models.sam2_zeroshot import SAM2ZeroShot

# Initialize the model (point sam2_checkpoint at your downloaded SAM 2 weights)
model = SAM2ZeroShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Segment with text prompts only: no support examples are needed
predictions = model.segment(
    image,
    "fashion",
    ["shirt", "pants", "dress", "shoes"]
)
```
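The attention-based localization described under Architecture can be approximated as follows. This sketch uses CLIP's CLS-to-patch self-attention as a stand-in for the text-conditioned cross-attention the model uses, and turns the strongest attention cell into a single (x, y) point prompt for SAM 2; the function name and details are illustrative.

```python
import numpy as np
import torch
from transformers import CLIPVisionModel, CLIPImageProcessor

@torch.no_grad()
def clip_attention_point(image, model_name="openai/clip-vit-base-patch32"):
    """Return an approximate (x, y) point prompt from CLIP's last-layer
    CLS-to-patch attention. `image` is a PIL image."""
    model = CLIPVisionModel.from_pretrained(model_name).eval()
    processor = CLIPImageProcessor.from_pretrained(model_name)
    inputs = processor(images=image, return_tensors="pt")
    out = model(**inputs, output_attentions=True)

    attn = out.attentions[-1][0].mean(0)   # (tokens, tokens), averaged over heads
    cls_to_patches = attn[0, 1:]           # attention from the CLS token to patches
    side = int(cls_to_patches.numel() ** 0.5)
    heatmap = cls_to_patches.reshape(side, side)

    # Map the hottest patch back to original image coordinates. This is
    # approximate: it ignores CLIP's resize/center-crop preprocessing.
    iy, ix = np.unravel_index(int(heatmap.argmax()), (side, side))
    w, h = image.size
    return int((ix + 0.5) * w / side), int((iy + 0.5) * h / side)
```

The resulting point could then be passed to SAM 2 as a positive point prompt alongside the text-derived class label.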
## 📁 Project Structure
```
├── models/
│   ├── sam2_fewshot.py            # Few-shot learning model
│   └── sam2_zeroshot.py           # Zero-shot learning model
├── experiments/
│   ├── few_shot_satellite.py      # Satellite experiments
│   └── zero_shot_fashion.py       # Fashion experiments
├── utils/
│   ├── data_loader.py             # Domain-specific data loaders
│   ├── metrics.py                 # Comprehensive evaluation metrics
│   └── visualization.py           # Visualization tools
├── scripts/
│   └── download_sam2.py           # Setup script
└── notebooks/
    └── analysis.ipynb             # Interactive analysis
```
## 🔬 Research Contributions
1. **Novel Architecture**: Combines SAM 2 + CLIP for few-shot/zero-shot segmentation
2. **Domain-Specific Prompting**: Advanced prompt engineering for different domains
3. **Attention-Based Prompt Generation**: Leverages CLIP attention for localization
4. **Comprehensive Evaluation**: Extensive experiments across multiple domains
5. **Open-Source Implementation**: Complete codebase for reproducibility
## 📖 Citation
If you use this work in your research, please cite:
```bibtex
@misc{sam2_fewshot_zeroshot_2024,
  title={SAM 2 Few-Shot/Zero-Shot Segmentation: Domain Adaptation with Minimal Supervision},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/esalguero/Segmentation}
}
```
## 🤝 Contributing
We welcome contributions! Please feel free to submit issues, pull requests, or suggestions for improvements.
## 📄 License
This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
## 🔗 Links
- **GitHub Repository**: [https://github.com/ParallelLLC/Segmentation](https://github.com/ParallelLLC/Segmentation)
- **Research Paper**: See `research_paper.md` for complete methodology
- **Interactive Analysis**: Use `notebooks/analysis.ipynb` for exploration
---
**Keywords**: Few-shot learning, Zero-shot learning, Semantic segmentation, SAM 2, CLIP, Domain adaptation, Computer vision