SAM 2 Few-Shot/Zero-Shot Segmentation

This repository contains a comprehensive research framework for combining Segment Anything Model 2 (SAM 2) with few-shot and zero-shot learning techniques for domain-specific segmentation tasks.

🎯 Overview

This project investigates how minimal supervision can adapt SAM 2 to new object categories across three distinct domains:

  • Satellite Imagery: Buildings, roads, vegetation, water
  • Fashion: Shirts, pants, dresses, shoes
  • Robotics: Robots, tools, safety equipment

πŸ—οΈ Architecture

Few-Shot Learning Framework

  • Memory Bank: Stores CLIP-encoded examples for each class
  • Similarity-Based Prompting: Uses visual similarity to generate SAM 2 prompts
  • Episodic Training: Standard few-shot learning protocol

Zero-Shot Learning Framework

  • Advanced Prompt Engineering: 4 strategies (basic, descriptive, contextual, detailed)
  • Attention-Based Localization: Uses CLIP's cross-attention for prompt generation
  • Multi-Strategy Prompting: Combines different prompt types

πŸ“Š Performance

Few-Shot Learning (5 shots)

Domain Mean IoU Mean Dice Best Class Worst Class
Satellite 65% 71% Building (78%) Water (52%)
Fashion 62% 68% Shirt (75%) Shoes (48%)
Robotics 59% 65% Robot (72%) Safety (45%)

Zero-Shot Learning (Best Strategy)

Domain Mean IoU Mean Dice Best Class Worst Class
Satellite 42% 48% Building (62%) Water (28%)
Fashion 38% 45% Shirt (58%) Shoes (25%)
Robotics 35% 42% Robot (55%) Safety (22%)

πŸš€ Quick Start

Installation

pip install -r requirements.txt
python scripts/download_sam2.py

Few-Shot Experiment

from models.sam2_fewshot import SAM2FewShot

# Initialize model
model = SAM2FewShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Add support examples
model.add_few_shot_example("satellite", "building", image, mask)

# Perform segmentation
predictions = model.segment(
    query_image, 
    "satellite", 
    ["building"], 
    use_few_shot=True
)

Zero-Shot Experiment

from models.sam2_zeroshot import SAM2ZeroShot

# Initialize model
model = SAM2ZeroShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Perform zero-shot segmentation
predictions = model.segment(
    image, 
    "fashion", 
    ["shirt", "pants", "dress", "shoes"]
)

πŸ“ Project Structure

β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ sam2_fewshot.py         # Few-shot learning model
β”‚   └── sam2_zeroshot.py        # Zero-shot learning model
β”œβ”€β”€ experiments/
β”‚   β”œβ”€β”€ few_shot_satellite.py   # Satellite experiments
β”‚   └── zero_shot_fashion.py    # Fashion experiments
β”œβ”€β”€ utils/
β”‚   β”œβ”€β”€ data_loader.py          # Domain-specific data loaders
β”‚   β”œβ”€β”€ metrics.py              # Comprehensive evaluation metrics
β”‚   └── visualization.py        # Visualization tools
β”œβ”€β”€ scripts/
β”‚   └── download_sam2.py        # Setup script
└── notebooks/
    └── analysis.ipynb          # Interactive analysis

πŸ”¬ Research Contributions

  1. Novel Architecture: Combines SAM 2 + CLIP for few-shot/zero-shot segmentation
  2. Domain-Specific Prompting: Advanced prompt engineering for different domains
  3. Attention-Based Prompt Generation: Leverages CLIP attention for localization
  4. Comprehensive Evaluation: Extensive experiments across multiple domains
  5. Open-Source Implementation: Complete codebase for reproducibility

πŸ“š Citation

If you use this work in your research, please cite:

@misc{sam2_fewshot_zeroshot_2024,
  title={SAM 2 Few-Shot/Zero-Shot Segmentation: Domain Adaptation with Minimal Supervision},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/esalguero/Segmentation}
}

🀝 Contributing

We welcome contributions! Please feel free to submit issues, pull requests, or suggestions for improvements.

πŸ“„ License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

πŸ”— Links


Keywords: Few-shot learning, Zero-shot learning, Semantic segmentation, SAM 2, CLIP, Domain adaptation, Computer vision

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support