README.md · keras/retinanet_resnet50_fpn_v2

metadata

library_name: keras-hub

Model Overview

A Keras model implementing the RetinaNet meta-architecture.

The model targets object detection. The constructor requires num_classes, bounding_box_format, and a backbone. A custom label encoder and a custom prediction decoder may optionally be provided.

Installation

Keras and KerasHub can be installed with:

pip install -U -q keras-hub
pip install -U -q keras

JAX, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
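
Outside Kaggle it can help to confirm which of these packages are actually present before running the examples. The following is a minimal sketch that uses only the Python standard library; the helper name installed_versions is hypothetical, and the package names are the PyPI distribution names.

```python
from importlib import metadata


def installed_versions(packages=("keras", "keras-hub", "jax", "tensorflow", "torch")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Only one backend needs to be installed. Keras reads the KERAS_BACKEND environment variable, which must be set before keras is imported.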

Presets

The following model checkpoints are provided by the Keras team. Full code examples for each are available below.

Preset name	Parameters	Description
retinanet_resnet50_fpn_coco	34.12M	RetinaNet model with ResNet50 backbone fine-tuned on COCO in 800x800 resolution.

Arguments

num_classes: The number of classes in your dataset, excluding the background class. Classes should be represented by integers in the range [0, num_classes).
bounding_box_format: The format of the bounding boxes in the input dataset. Refer to the keras.io docs for details on supported bounding box formats.
backbone: A keras.Model. If the default feature_pyramid is used, the backbone must implement the pyramid_level_inputs property with keys "P3", "P4", and "P5" and layer names as values. A sensible default in many cases is keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet").
anchor_generator: (Optional) A keras_cv.layers.AnchorGenerator. If provided, the anchor generator is passed to both the label_encoder and the prediction_decoder. Use it only when label_encoder and prediction_decoder are both None. Defaults to an anchor generator with the parameterization: strides=[2**i for i in range(3, 8)], scales=[2**x for x in [0, 1 / 3, 2 / 3]], sizes=[32.0, 64.0, 128.0, 256.0, 512.0], and aspect_ratios=[0.5, 1.0, 2.0].
label_encoder: (Optional) A keras.Layer whose call() method accepts an image Tensor, a bounding box Tensor, and a bounding box class Tensor, and returns RetinaNet training targets. By default, a standard KerasCV RetinaNetLabelEncoder is created and used. The results of this object's call() method are passed to the loss objects for box_loss and classification_loss as the y_true argument.
prediction_decoder: (Optional) A keras.layers.Layer that transforms RetinaNet predictions into usable bounding box Tensors. If not provided, the default is a keras_cv.layers.MultiClassNonMaxSuppression layer, which prunes boxes with non-max suppression.
feature_pyramid: (Optional) A keras.layers.Layer that produces a list of 4D feature maps (batch dimension included) when called on the pyramid-level outputs of the backbone. If not provided, the reference implementation from the paper is used.
classification_head: (Optional) A keras.Layer that classifies the bounding boxes. If not provided, a simple ConvNet with 3 layers is used.
box_head: (Optional) A keras.Layer that regresses the bounding boxes. If not provided, a simple ConvNet with 3 layers is used.
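
To make the default anchor_generator parameterization above more concrete, the sketch below expands it in plain Python. It does not call the library's anchor generator. It assumes that an aspect ratio means width / height and that every anchor keeps the area (size * scale) ** 2; the helper anchor_shapes is hypothetical.

```python
import math

# Default parameterization quoted in the anchor_generator argument.
strides = [2**i for i in range(3, 8)]  # pyramid levels P3..P7
scales = [2**x for x in [0, 1 / 3, 2 / 3]]
sizes = [32.0, 64.0, 128.0, 256.0, 512.0]
aspect_ratios = [0.5, 1.0, 2.0]


def anchor_shapes(size, scales, aspect_ratios):
    """Return (width, height) pairs for one pyramid level.

    Assumption: ratio = width / height, and the anchor area is
    (size * scale) ** 2 regardless of ratio.
    """
    shapes = []
    for scale in scales:
        side = size * scale
        for ratio in aspect_ratios:
            root = math.sqrt(ratio)
            shapes.append((side * root, side / root))
    return shapes


# Every feature-map location gets one anchor per (scale, ratio) pair.
anchors_per_location = len(scales) * len(aspect_ratios)

for level, (stride, size) in enumerate(zip(strides, sizes), start=3):
    shapes = anchor_shapes(size, scales, aspect_ratios)
    widest = max(w for w, _ in shapes)
    print(f"P{level}: stride={stride}, {len(shapes)} anchors, widest={widest:.1f}px")
```

With three scales and three aspect ratios, each location carries nine anchors, and the base size doubles from one pyramid level to the next along with the stride.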

Example Usage

Pretrained RetinaNet model

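import keras_hub
import numpy as np

# Load the COCO-pretrained detector.
object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "retinanet_resnet50_fpn_v2_coco"
)

# Run the detector on a random batch of two 224x224 RGB images.
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)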
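
Detections usually need to be filtered before use. The sketch below shows one way to do that with NumPy. It is not taken from the library: the dictionary keys "boxes", "confidence", and "labels", the padding of unused slots with -1, and the helper keep_confident are all assumptions, so check the structure your KerasHub version actually returns.

```python
import numpy as np


def keep_confident(predictions, threshold=0.5):
    """Keep detections whose confidence is at least `threshold`.

    Assumption: `predictions` is a dict of arrays with hypothetical keys
    "boxes" (batch, N, 4), "confidence" (batch, N) and "labels" (batch, N),
    where unused slots are padded with -1.
    """
    results = []
    for boxes, scores, labels in zip(
        predictions["boxes"], predictions["confidence"], predictions["labels"]
    ):
        # For a positive threshold, padded slots (-1) are dropped as well.
        mask = scores >= threshold
        results.append(
            {"boxes": boxes[mask], "confidence": scores[mask], "labels": labels[mask]}
        )
    return results


# Stand-in data shaped like the assumed output, not real model predictions.
fake = {
    "boxes": np.array(
        [[[10, 20, 110, 220], [5, 5, 50, 50], [-1, -1, -1, -1]]], dtype="float32"
    ),
    "confidence": np.array([[0.9, 0.3, -1.0]], dtype="float32"),
    "labels": np.array([[1, 17, -1]]),
}

for image_result in keep_confident(fake):
    print(image_result["labels"], image_result["confidence"])
```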

Fine-tune the pre-trained model

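import keras_hub

# Reuse the pretrained backbone and preprocessor from the COCO preset.
backbone = keras_hub.models.Backbone.from_preset(
    "retinanet_resnet50_fpn_v2_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "retinanet_resnet50_fpn_v2_coco"
)
# CLASSES is a placeholder for the list of class names in your dataset.
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)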

Custom training the model

import keras_hub

# Rescale pixel values from [0, 255] to [0, 1].
image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1/255
)

preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)

# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False
)
# CLASSES is a placeholder for the list of class names in your dataset.
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)

Example Usage with Hugging Face URI

Pretrained RetinaNet model

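import keras_hub
import numpy as np

# Load the COCO-pretrained detector from the Hugging Face Hub.
object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "hf://keras/retinanet_resnet50_fpn_v2_coco"
)

# Run the detector on a random batch of two 224x224 RGB images.
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)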

Fine-tune the pre-trained model

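import keras_hub

# Reuse the pretrained backbone and preprocessor from the Hugging Face Hub.
backbone = keras_hub.models.Backbone.from_preset(
    "hf://keras/retinanet_resnet50_fpn_v2_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "hf://keras/retinanet_resnet50_fpn_v2_coco"
)
# CLASSES is a placeholder for the list of class names in your dataset.
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)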

Custom training the model

import keras_hub

# Rescale pixel values from [0, 255] to [0, 1].
image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1/255
)

preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)

# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False
)
# CLASSES is a placeholder for the list of class names in your dataset.
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)
