NOAA AFSC Marine Mammal Lab YOLOv11 Ice Seal Object Detection Model

Accurate identification of ice-associated seals in aerial imagery is essential for monitoring population dynamics and assessing ecological trends in Arctic and sub-Arctic environments. To facilitate the fast and accurate detection of these seals, I have developed a deep learning-based object detection model utilizing YOLOv11n, a lightweight neural network architecture optimized for high-performance image analysis. The model, comprising 319 layers and 2,591,400 parameters, was trained on a diverse dataset containing 7,671 high-resolution images, which were tiled into 30,684 smaller images to enhance feature recognition.

To improve generalization, data augmentation techniques, including Blur, MedianBlur, ToGray, and CLAHE, were applied. The model was trained at a resolution of 1024×1024 using an SGD optimizer (lr=0.01, momentum=0.9) with a batch size of 113 on an NVIDIA A100 GPU. Eight distinct seal categories, including bearded, ribbon, ringed, and spotted seals in both pup and adult stages, were annotated across the dataset.

The model achieved an overall mean average precision (mAP) of 0.4833, with particularly high performance on bearded seals (0.9 mAP) and ringed seals (0.833 mAP), and moderate performance on ribbon seals (0.464 mAP). Pup classes consistently underperformed due to smaller size and limited representation. Evaluation metrics indicate that the model distinguishes seal species with high precision (0.76783), while recall at the same epoch is lower (0.43806). Because of this high precision, the model is easily adaptable as an ROI detector by collapsing classes to one during inference and lowering the confidence threshold to 0.01, combined with a well-tuned IoU threshold. Under these settings, it successfully detected 98.27% of all seals, regardless of class. The model was successfully ported to TFLite for edge devices, enabling TPU acceleration and providing accurate inference in near real-time. These findings show the model’s potential for large-scale, automated monitoring of ice-associated seal populations in remote and challenging environments.


Ice-Associated Seal Detection

This model is best at detecting ringed, spotted, and bearded seals. While false negatives are rare, underrepresented classes tend to be misclassified rather than completely missed. This property is beneficial in workflows that incorporate manual review (e.g., imaging surveys). It also supports advanced techniques like clustering and two-shot fine-tuning. For real-time inference, YOLO remains the recommended architecture.


Model Details

  • Architecture: YOLOv11n
  • Layers: 319
  • Parameters: 2,591,400
  • Gradients: 2,591,384
  • GFLOPs: 6.4
  • Classes (nc=8):
    1. bearded_pup
    2. bearded_seal
    3. ribbon_pup
    4. ribbon_seal
    5. ringed_pup
    6. ringed_seal
    7. spotted_pup
    8. spotted_seal

Dataset and Preprocessing

  • Source Imagery:
    • Original resolution: 6048 × 4032
    • Total source images: 7,671 (including 156 null examples)
    • Resolution range: ~12.21 MP to ~24.39 MP
  • Tiling:
    • Each source image was split into 4 tiles (2 rows × 2 columns) at 1024 × 1024.
    • Post-tiling: 30,684 images.
  • Annotations:
    • 8,948 total annotations.
    • ~1.2 annotations per source image on average (8,948 / 7,671).
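
The tiling arithmetic above can be sketched as follows. This is a minimal illustration, not the preprocessing code used to build the dataset: the function name `tile_boxes` is hypothetical, and it only computes the pixel bounds of a 2 × 2 grid over a 6048 × 4032 image. How each tile was brought to 1024 × 1024 is not specified here, so the sketch leaves that step out.

```python
def tile_boxes(width, height, rows=2, cols=2):
    """Return (left, top, right, bottom) pixel bounds for a rows x cols grid."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            top = r * height // rows
            right = (c + 1) * width // cols
            bottom = (r + 1) * height // rows
            boxes.append((left, top, right, bottom))
    return boxes


# Bounds of the four tiles for one full-resolution source image.
tiles = tile_boxes(6048, 4032)
for box in tiles:
    print(box)

# Tiles per image times the number of source images gives the post-tiling count.
source_images = 7671
tiled_images = len(tiles) * source_images
print(tiled_images)  # 30684
```

Integer division keeps the tile edges aligned, so adjacent tiles share a boundary without overlapping or leaving gaps.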

Class Distribution

Class Name     Total Count   Training Count   Validation Count   Test Count
ringed_seal    3180          2190             674                316
bearded_seal   1922          1392             359                171
spotted_seal   1662          1142             344                176
unknown_seal   812           585              153                74
bearded_pup    420           300              77                 43
unknown_pup    392           275              63                 54
spotted_pup    238           174              46                 18
ribbon_seal    232           154              45                 33
ringed_pup     54            37               14                 3
ribbon_pup     36            28               5                  3

(Note: Model training specifically uses the 8 classes listed in Model Details.)
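
To make the imbalance in the table concrete, the short sketch below recomputes each class's share of the annotations from the Total Count column. The dictionary simply copies the table; treating `unknown_seal` and `unknown_pup` as excluded from training is an assumption based on the note above.

```python
# Total annotation counts copied from the class distribution table.
total_counts = {
    "ringed_seal": 3180,
    "bearded_seal": 1922,
    "spotted_seal": 1662,
    "unknown_seal": 812,
    "bearded_pup": 420,
    "unknown_pup": 392,
    "spotted_pup": 238,
    "ribbon_seal": 232,
    "ringed_pup": 54,
    "ribbon_pup": 36,
}

# Assumption: the two "unknown" classes are not among the 8 model classes.
excluded = {"unknown_seal", "unknown_pup"}

all_annotations = sum(total_counts.values())
trained = {name: n for name, n in total_counts.items() if name not in excluded}
trained_annotations = sum(trained.values())

print(f"all annotations: {all_annotations}")
print(f"annotations in the 8 model classes: {trained_annotations}")
for name, n in sorted(trained.items(), key=lambda kv: kv[1], reverse=True):
    share = 100 * n / trained_annotations
    print(f"{name:<13} {n:>5}  {share:5.1f}%")
```

The per-class shares show how few examples the pup classes contribute, which is consistent with their weaker performance.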


Training Configuration

  • Image Size: 1024 × 1024
  • Batch Size: 113
  • Dataloader Workers: 8
  • Hardware: NVIDIA A100 GPU
  • Optimizer: SGD
    • Learning Rate: 0.01
    • Momentum: 0.9
    • Parameter Groups:
      • 81 weight (decay=0.0)
      • 88 weight (decay=0.000484375)
      • 87 bias (decay=0.0)
  • Augmentations (Albumentations):
    • Blur(p=0.01, blur_limit=(3, 7))
    • MedianBlur(p=0.01, blur_limit=(3, 7))
    • ToGray(p=0.01, num_output_channels=3, method='weighted_average')
    • CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))

Metrics (Epoch 64)

  • epoch: 64
  • time: 23230.9
  • train/box_loss: 1.34616
  • train/cls_loss: 1.18894
  • train/dfl_loss: 0.89475
  • metrics/precision(B): 0.76783
  • metrics/recall(B): 0.43806
  • metrics/mAP50(B): 0.4671
  • metrics/mAP50-95(B): 0.30454
  • val/box_loss: 1.37059
  • val/cls_loss: 1.77372
  • val/dfl_loss: 0.90735
  • lr/pg0: 0.00993763
  • lr/pg1: 0.00993763
  • lr/pg2: 0.00993763
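
Precision and recall can be combined into a single F1 score, which is not reported above. The snippet below is a worked calculation from the epoch 64 values in this section; the helper name `f1_score` is illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the epoch 64 metrics above.
precision = 0.76783
recall = 0.43806

f1 = f1_score(precision, recall)
print(f"F1 at epoch 64: {f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, the result sits closer to the recall than to the precision.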

Usage & Recommendations

  • Optimal Performance: Detecting ringed, spotted, and bearded seals.
  • Underrepresented Classes: Misclassification is more common, though false negatives remain rare.
  • Review Pipeline: Intended for manual verification in certain workflows.
  • Downstream Tasks: Clustering, two-shot fine-tuning, or in-depth analytics.
  • Real-Time Inference: Best to use the YOLO model.

Example Inference Code

Below is a brief example (in Python) showing how you might run inference using the model.predict method from the Ultralytics YOLO interface. This snippet demonstrates usage on a still image, a video file, and a live webcam feed. Adjust paths, thresholds, and device settings as needed.

# Example: Running inference with YOLO for Ice Seal Detection

# 1. Install ultralytics if not already:
# pip install ultralytics

from ultralytics import YOLO

# 2. Load your custom-trained weights
model = YOLO("path/to/NOAA_AFSC_MML_Iceseals_31K.pt")

# 3. Inference on a still image
#    Using a low confidence threshold (e.g., 0.01) is helpful when using the model as an ROI detector 
#    with a well-tuned IoU threshold.
results_image = model.predict(
    source="path/to/your_image.jpg",
    conf=0.01,   # Confidence threshold
    iou=0.45,    # IoU threshold
    device=0     # Use GPU 0 if available; set to 'cpu' if no GPU
)

# 4. Inference on a video file
#    The 'show=True' flag will display the annotated frames on screen in real time.
results_video = model.predict(
    source="path/to/your_video.mp4",
    conf=0.01,
    iou=0.45,
    device=0,
    show=True
)

# 5. Live video stream (e.g., webcam)
#    source=0 typically refers to the default camera. If multiple cameras exist, use 1, 2, etc.
results_live = model.predict(
    source=0,    # Webcam
    conf=0.01,
    iou=0.45,
    device=0,
    show=True
)

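# 6. Accessing the results
#    Each 'results_*' object is a list of Results objects, one per image or frame,
#    holding the predictions (boxes, confidence, class IDs, etc.)
for result in results_image:
    print("Detections on the image:")
    # result.boxes has bounding box data; result.names maps class IDs to labels
    for box in result.boxes:
        class_name = result.names[int(box.cls)]
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(f"  {class_name}: conf={float(box.conf):.2f}, "
              f"box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")

# Similarly, you can iterate over results in video/live inference if needed for further processing.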

Notes:

  • Confidence Threshold (conf): Lowering it (e.g., 0.01) can help capture most bounding boxes when using the model purely as an ROI detector. You should then rely on a more precise IoU threshold or post-processing to filter out false positives.
  • IoU Threshold (iou): This is the overlap threshold used for non-maximum suppression. Lower values suppress more overlapping boxes and so yield fewer detections; higher values keep more overlapping boxes.
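  • Real-Time Inference: This demo uses show=True for immediate display; removing it will still run inference but won’t display frames. You can also pipe results into further post-processing, tracking, or saving. For long videos or live streams, passing stream=True makes predict return a generator, so results are produced as you iterate instead of accumulating in memory.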
  • Device: Change device=0 to 'cpu' if you don’t have a GPU available.
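
To illustrate how the conf and iou settings interact, here is a minimal pure-Python sketch of confidence filtering followed by greedy, class-agnostic non-maximum suppression. It is not the Ultralytics implementation; the function names and the sample boxes are hypothetical, and ignoring the class label is only meant to mirror the ROI-detector use described earlier.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def roi_filter(detections, conf=0.01, iou_thresh=0.45):
    """Keep boxes scoring at least `conf`, then suppress overlaps regardless of class.

    `detections` is a list of (box, score, class_name) tuples.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= conf),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score, _ in candidates:
        # Keep a box only if it does not overlap a higher-scoring kept box too much.
        if all(iou(box, k[0]) <= iou_thresh for k in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections: two overlapping boxes on one animal, one separate
# low-confidence box, and one box below the confidence threshold.
detections = [
    ((100, 100, 200, 200), 0.80, "ringed_seal"),
    ((105, 105, 205, 205), 0.30, "spotted_seal"),
    ((400, 400, 480, 470), 0.05, "bearded_pup"),
    ((600, 600, 650, 650), 0.005, "ribbon_seal"),
]

rois = roi_filter(detections)
for box, score in rois:
    print(box, score)
```

In this toy example the second box is suppressed because it overlaps the first by more than the IoU threshold, even though the two carry different class labels, and the last box is dropped by the confidence threshold. Raising `iou_thresh` would keep more near-duplicate boxes.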

Environmental Impact

  • Compute Location: Google Cloud Platform (GCP), northamerica-northeast1 region
    • Carbon efficiency: 0.03 kgCO₂eq/kWh
  • Hardware: A100 PCIe 40/80GB (TDP of 250W)
  • Compute Duration: ~10 hours
  • Total Emissions: ~0.07 kgCO₂eq

Estimation based on the Machine Learning Impact calculator, from Lacoste et al. (2019).
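
The emissions figure can be reproduced approximately from the numbers above. This is a back-of-the-envelope check that assumes the GPU drew its full 250 W TDP for the whole ~10 hours; it is not necessarily the calculator's exact method.

```python
tdp_watts = 250            # A100 PCIe TDP listed above
hours = 10                 # approximate compute duration
carbon_intensity = 0.03    # kgCO2eq per kWh for the region listed above

# Energy in kWh, then emissions in kgCO2eq.
energy_kwh = tdp_watts * hours / 1000
emissions_kg = energy_kwh * carbon_intensity

print(f"energy: {energy_kwh:.2f} kWh")
print(f"emissions: {emissions_kg:.3f} kgCO2eq")
```

Under these assumptions the estimate comes to 0.075 kgCO₂eq, in line with the ~0.07 kgCO₂eq reported.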

Disclaimer

This repository is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA project content is provided on an ‘as is’ basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.
