NOAA AFSC Marine Mammal Lab YOLOv11 Ice Seal Object Detection Model

Accurate identification of ice-associated seals in aerial imagery is essential for monitoring population dynamics and assessing ecological trends in Arctic and sub-Arctic environments. To support fast and accurate detection of these seals, I developed a deep learning object detection model based on YOLOv11n, a lightweight neural network architecture optimized for high-performance image analysis. The model, comprising 319 layers and 2,591,400 parameters, was trained on a diverse dataset of 7,671 high-resolution images, which were tiled into 30,684 smaller images to enhance feature recognition.

To improve generalization, data augmentation techniques, including Blur, MedianBlur, ToGray, and CLAHE, were applied. The model was trained at a resolution of 1024×1024 using an SGD optimizer (lr=0.01, momentum=0.9) with a batch size of 113 on an NVIDIA A100 GPU. Eight distinct seal categories, including bearded, ribbon, ringed, and spotted seals in both pup and adult stages, were annotated across the dataset.

The model achieved an overall mean average precision (mAP) of 0.4833, with particularly high performance on bearded seals (0.9 mAP) and ringed seals (0.833 mAP), and moderate performance on ribbon seals (0.464 mAP). Pup classes consistently underperformed due to smaller size and limited representation. Evaluation metrics indicate that the model distinguishes seal species with high precision (0.76783), while validation recall was lower (0.43806 at epoch 64). Because of this high precision, the model is easily adaptable as an ROI detector by collapsing classes to one during inference and lowering the confidence threshold to 0.01, combined with a well-tuned IoU threshold. Under these settings, it successfully detected 98.27% of all seals, regardless of class. The model was successfully ported to TFLite for edge devices, enabling TPU acceleration and providing accurate inference in near real-time. These findings show the model’s potential for large-scale, automated monitoring of ice-associated seal populations in remote and challenging environments.

Ice-Associated Seal Detection

This model is best at detecting ringed, spotted, and bearded seals. While false negatives are rare, underrepresented classes tend to be misclassified rather than completely missed. This property is beneficial in workflows that incorporate manual review (e.g., imaging surveys). It also supports advanced techniques like clustering and two-shot fine-tuning. For real-time inference, YOLO remains the recommended architecture.

Model Details

Architecture: YOLOv11n
Layers: 319
Parameters: 2,591,400
Gradients: 2,591,384
GFLOPs: 6.4
Classes (nc=8):
1. bearded_pup
2. bearded_seal
3. ribbon_pup
4. ribbon_seal
5. ringed_pup
6. ringed_seal
7. spotted_pup
8. spotted_seal

Dataset and Preprocessing

Source Imagery:
- Original resolution: 6048 × 4032
- Total source images: 7,671 (including 156 null examples)
- Resolution range: ~12.21 MP to ~24.39 MP
Tiling:
- Each source image was split into 4 tiles (2 rows × 2 columns) at 1024 × 1024.
- Post-tiling: 30,684 images.
Annotations:
- 8,948 total annotations.
- ~1.2 annotations per source image on average (8,948 annotations across 7,671 images).
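The 2 × 2 tiling described above can be sketched as follows. This is a minimal illustration, not the author's preprocessing code: the function name `tile_boxes` is hypothetical, tiles are assumed not to overlap, and how each tile reaches 1024 × 1024 is not described in this card, so it is left out.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, rows: int = 2, cols: int = 2) -> List[Box]:
    """Return crop boxes that split a width x height image into rows x cols tiles.

    Hypothetical helper: assumes non-overlapping tiles that cover the whole image.
    """
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            top = r * height // rows
            bottom = (r + 1) * height // rows
            boxes.append((left, top, right, bottom))
    return boxes


# Crop boxes for one full-resolution source image (6048 x 4032)
boxes = tile_boxes(6048, 4032)

# Every source image yields len(boxes) tiles
total_tiles = 7671 * len(boxes)

print(boxes)
print(total_tiles)
```

Each box can be passed to an image library's crop function; annotations would also need to be shifted into tile coordinates, which this sketch does not cover.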

Class Distribution

Class Name	Total Count	Training Count	Validation Count	Test Count
ringed_seal	3180	2190	674	316
bearded_seal	1922	1392	359	171
spotted_seal	1662	1142	344	176
unknown_seal	812	585	153	74
bearded_pup	420	300	77	43
unknown_pup	392	275	63	54
spotted_pup	238	174	46	18
ribbon_seal	232	154	45	33
ringed_pup	54	37	14	3
ribbon_pup	36	28	5	3

(Note: The model is trained only on the 8 classes listed in Model Details; unknown_seal and unknown_pup appear in this table but are not among those classes.)
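To make the class imbalance in the table above concrete, the short script below recomputes each class's share of all annotations and its train/validation/test split. The counts are copied from the table; the variable names are only illustrative.

```python
# (total, train, validation, test) counts copied from the Class Distribution table
COUNTS = {
    "ringed_seal": (3180, 2190, 674, 316),
    "bearded_seal": (1922, 1392, 359, 171),
    "spotted_seal": (1662, 1142, 344, 176),
    "unknown_seal": (812, 585, 153, 74),
    "bearded_pup": (420, 300, 77, 43),
    "unknown_pup": (392, 275, 63, 54),
    "spotted_pup": (238, 174, 46, 18),
    "ribbon_seal": (232, 154, 45, 33),
    "ringed_pup": (54, 37, 14, 3),
    "ribbon_pup": (36, 28, 5, 3),
}

total_annotations = sum(row[0] for row in COUNTS.values())

for name, (total, train, val, test) in COUNTS.items():
    # Sanity check: the three splits should add up to the class total
    assert train + val + test == total
    share = total / total_annotations
    print(
        f"{name:13s} {share:6.1%} of annotations, "
        f"train/val/test = {train / total:.2f}/{val / total:.2f}/{test / total:.2f}"
    )

# Ratio between the most and least frequent classes
imbalance = COUNTS["ringed_seal"][0] / COUNTS["ribbon_pup"][0]
print(f"ringed_seal : ribbon_pup = {imbalance:.1f} : 1")
```

The pup classes with only a handful of test examples (ringed_pup and ribbon_pup have 3 each) are the ones for which per-class metrics will be least stable.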

Training Configuration

Image Size: 1024 × 1024
Batch Size: 113
Dataloader Workers: 8
Hardware: NVIDIA A100 GPU
Optimizer: SGD
- Learning Rate: 0.01
- Momentum: 0.9
- Parameter Groups:
  - 81 weight (decay=0.0)
  - 88 weight (decay=0.000484375)
  - 87 bias (decay=0.0)
Augmentations (Albumentations):
- Blur(p=0.01, blur_limit=(3, 7))
- MedianBlur(p=0.01, blur_limit=(3, 7))
- ToGray(p=0.01, num_output_channels=3, method='weighted_average')
- CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))
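A training run with the settings listed above might be launched roughly as below. This is a sketch under assumptions, not the author's training script: the dataset file passed as `data_yaml` and the starting weights `yolo11n.pt` are placeholders, the number of epochs is not stated in this card and is therefore left at the library default, and the function name is hypothetical. To my understanding, Ultralytics applies the four Albumentations transforms listed above automatically when the `albumentations` package is installed, so they are not configured here.

```python
# Hyperparameters taken from the Training Configuration list above
TRAIN_KWARGS = {
    "imgsz": 1024,        # training resolution
    "batch": 113,         # batch size
    "workers": 8,         # dataloader workers
    "optimizer": "SGD",
    "lr0": 0.01,          # initial learning rate
    "momentum": 0.9,
    "device": 0,          # first GPU
}


def train_ice_seal_model(data_yaml: str, weights: str = "yolo11n.pt"):
    """Start a training run; data_yaml and weights are placeholder assumptions."""
    # Imported lazily so the settings above can be inspected without ultralytics installed
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_KWARGS)
```

Calling `train_ice_seal_model("path/to/dataset.yaml")` would start training; the path is illustrative only.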

Metrics (Epoch 64)

epoch	time	train/box_loss	train/cls_loss	train/dfl_loss	metrics/precision(B)	metrics/recall(B)	metrics/mAP50(B)	metrics/mAP50-95(B)	val/box_loss	val/cls_loss	val/dfl_loss	lr/pg0	lr/pg1	lr/pg2
64	23230.9	1.34616	1.18894	0.89475	0.76783	0.43806	0.4671	0.30454	1.37059	1.77372	0.90735	0.00993763	0.00993763	0.00993763
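Precision and recall at epoch 64 differ considerably, so a single combined figure can be useful. The snippet below computes the F1 score (the harmonic mean of precision and recall) from the two values in the row above; F1 is not reported in the original metrics and is derived here only for illustration.

```python
# Values copied from the epoch 64 metrics row
precision = 0.76783
recall = 0.43806

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 at epoch 64: {f1:.3f}")
```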

Usage & Recommendations

Optimal Performance: Detecting ringed, spotted, and bearded seals.
Underrepresented Classes: Misclassification is more common, though false negatives remain rare.
Review Pipeline: Intended for manual verification in certain workflows.
Downstream Tasks: Clustering, two-shot fine-tuning, or in-depth analytics.
Real-Time Inference: Best to use the YOLO model.

Example Inference Code

Below is a brief example (in Python) showing how you might run inference using the model.predict method from the Ultralytics YOLO interface. This snippet demonstrates usage on a still image, a video file, and a live webcam feed. Adjust paths, thresholds, and device settings as needed.

# Example: Running inference with YOLO for Ice Seal Detection

# 1. Install ultralytics if not already:
# pip install ultralytics

from ultralytics import YOLO

# 2. Load your custom-trained weights
model = YOLO("path/to/NOAA_AFSC_MML_Iceseals_31K.pt")

# 3. Inference on a still image
#    Using a low confidence threshold (e.g., 0.01) is helpful when using the model as an ROI detector 
#    with a well-tuned IoU threshold.
results_image = model.predict(
    source="path/to/your_image.jpg",
    conf=0.01,   # Confidence threshold
    iou=0.45,    # IoU threshold
    device=0     # Use GPU 0 if available; set to 'cpu' if no GPU
)

# 4. Inference on a video file
#    The 'show=True' flag will display the annotated frames on screen in real time.
results_video = model.predict(
    source="path/to/your_video.mp4",
    conf=0.01,
    iou=0.45,
    device=0,
    show=True
)

# 5. Live video stream (e.g., webcam)
#    source=0 typically refers to the default camera. If multiple cameras exist, use 1, 2, etc.
results_live = model.predict(
    source=0,    # Webcam
    conf=0.01,
    iou=0.45,
    device=0,
    show=True
)

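# 6. Accessing the results
#    Each 'results_*' object contains predictions (boxes, confidence, class IDs, etc.)
for result in results_image:
    print("Detections on the image:")
    # result.boxes holds one entry per detection
    for box in result.boxes:
        class_name = result.names[int(box.cls)]   # map class ID to its label
        confidence = float(box.conf)
        x1, y1, x2, y2 = box.xyxy[0].tolist()     # corner coordinates in pixels
        print(f"  {class_name}: conf={confidence:.2f}, box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")

# Similarly, you can iterate over results in video/live inference if needed for further processing.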

Notes:

Confidence Threshold (conf): Lowering it (e.g., 0.01) can help capture most bounding boxes when using the model purely as an ROI detector. You should then rely on a more precise IoU threshold or post-processing to filter out false positives.
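IoU Threshold (iou): Controls non-maximum suppression. When two boxes overlap by more than this value, only the higher-confidence box is kept, so a lower value removes more overlapping boxes and a higher value keeps more of them.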
Real-Time Inference: This demo uses show=True for immediate display; removing it will still run inference but won’t display frames. You can also pipe results into further post-processing, tracking, or saving.
Device: Change device=0 to 'cpu' if you don’t have a GPU available.
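The ROI-detector idea described in these notes (accept low-confidence boxes, ignore the predicted class, then suppress overlapping boxes) can be sketched in plain Python as below. This is an illustrative post-processing step, not the author's pipeline: the function names, the detection tuple layout, and the example boxes are all made up for the demonstration. Ultralytics also exposes an `agnostic_nms` argument to `predict`, which may achieve a similar effect; whether it was used for the reported results is not stated.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, confidence, class_id); layout chosen for this sketch
Detection = Tuple[float, float, float, float, float, int]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def collapse_to_rois(
    dets: List[Detection], conf_thres: float = 0.01, iou_thres: float = 0.45
) -> List[Detection]:
    """Map every class to a single 'seal' class (ID 0) and run class-agnostic NMS."""
    candidates = [
        (x1, y1, x2, y2, conf, 0)
        for x1, y1, x2, y2, conf, _ in dets
        if conf >= conf_thres
    ]
    # Highest confidence first, so weaker overlapping boxes are the ones dropped
    candidates.sort(key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in candidates:
        if all(iou(det, k) <= iou_thres for k in kept):
            kept.append(det)
    return kept


# Made-up detections: two boxes on the same animal with different class IDs,
# plus one low-confidence candidate elsewhere in the image
dets = [
    (100, 100, 200, 200, 0.60, 5),
    (105, 102, 205, 198, 0.30, 7),
    (400, 400, 450, 450, 0.02, 2),
]
rois = collapse_to_rois(dets)
print(rois)
```

In this example the two overlapping boxes merge into one ROI because class labels are ignored, while the low-confidence box survives for manual review.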

Environmental Impact

Compute Location: Google Cloud Platform (GCP), northamerica-northeast1 region
- Carbon efficiency: 0.03 kgCO₂eq/kWh
Hardware: A100 PCIe 40/80GB (TDP of 250W)
Compute Duration: ~10 hours
Total Emissions: ~0.07 kgCO₂eq

Estimation based on the Machine Learning Impact calculator, from Lacoste et al. (2019).
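The reported figure can be roughly reproduced from the numbers above. The calculation below is a simplified sketch that assumes the GPU draws its full TDP for the entire run and ignores other hardware and datacenter overhead; it is not the calculator's exact method.

```python
# Values copied from the Environmental Impact section
tdp_watts = 250            # A100 PCIe thermal design power
hours = 10                 # approximate compute duration
carbon_intensity = 0.03    # kgCO2eq per kWh for the stated region

# Energy in kWh, assuming constant draw at TDP (a simplification)
energy_kwh = tdp_watts / 1000 * hours

# Emissions in kgCO2eq
emissions_kg = energy_kwh * carbon_intensity

print(f"Energy: {energy_kwh:.2f} kWh")
print(f"Emissions: {emissions_kg:.3f} kgCO2eq")
```

The result is about 0.075 kgCO₂eq, in line with the ~0.07 kgCO₂eq reported above.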

Disclaimer

This repository is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA project content is provided on an ‘as is’ basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.