metadata
library_name: keras-hub
Model Overview
A Keras model implementing the RetinaNet meta-architecture.
Implements the RetinaNet architecture for object detection. The constructor requires num_classes, bounding_box_format, and a backbone. Optionally, a custom label encoder and prediction decoder may be provided.
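As a rough illustration of what bounding_box_format distinguishes, the sketch below converts one box between two common layouts: corner-based "xyxy" and corner-plus-size "xywh". The helper names are hypothetical and not part of KerasHub; the Keras docs remain the authoritative list of supported formats.

```python
def xywh_to_xyxy(box):
    """Convert (left, top, width, height) to (left, top, right, bottom)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xyxy_to_xywh(box):
    """Convert (left, top, right, bottom) to (left, top, width, height)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


# The same box expressed in both layouts (hypothetical pixel values).
box_xywh = (10.0, 20.0, 30.0, 40.0)
box_xyxy = xywh_to_xyxy(box_xywh)
print(box_xyxy)  # (10.0, 20.0, 40.0, 60.0)
```

Passing boxes in a layout other than the one declared to the model would silently produce wrong training targets, which is why the format is a required constructor argument.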
Links
- RetinaNet Quickstart Notebook
- RetinaNet API Documentation
- RetinaNet Model Card
- KerasHub Beginner Guide
- KerasHub Model Publishing Guide
Installation
Keras and KerasHub can be installed with:
pip install -U -q keras-hub
pip install -U -q keras
JAX, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
Presets
The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
Preset name | Parameters | Description |
---|---|---|
retinanet_resnet50_fpn_coco | 34.12M | RetinaNet model with ResNet50 backbone fine-tuned on COCO in 800x800 resolution. |
Arguments
- num_classes: The number of classes in your dataset, excluding the background class. Classes should be represented by integers in the range [0, num_classes).
- bounding_box_format: The format of the bounding boxes in the input dataset. Refer to the keras.io docs for more details on supported bounding box formats.
- backbone: A keras.Model. If the default feature_pyramid is used, the backbone must implement the pyramid_level_inputs property with keys "P3", "P4", and "P5" and layer names as values. A sensible backbone to use in many cases is keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet").
- anchor_generator: (Optional) A keras_cv.layers.AnchorGenerator. If provided, the anchor generator will be passed to both the label_encoder and the prediction_decoder. Only to be used when both label_encoder and prediction_decoder are None. Defaults to an anchor generator with the parameterization: strides=[2**i for i in range(3, 8)], scales=[2**x for x in [0, 1 / 3, 2 / 3]], sizes=[32.0, 64.0, 128.0, 256.0, 512.0], and aspect_ratios=[0.5, 1.0, 2.0].
- label_encoder: (Optional) A keras.Layer that accepts an image Tensor, a bounding box Tensor, and a bounding box class Tensor in its call() method, and returns RetinaNet training targets. By default, a standard KerasCV RetinaNetLabelEncoder is created and used. The results of this object's call() method are passed to the loss objects for box_loss and classification_loss as the y_true argument.
- prediction_decoder: (Optional) A keras.layers.Layer that is responsible for transforming RetinaNet predictions into usable bounding box Tensors. If not provided, a default is used. The default prediction_decoder layer is a keras_cv.layers.MultiClassNonMaxSuppression layer, which uses non-max suppression for box pruning.
- feature_pyramid: (Optional) A keras.layers.Layer that produces a list of 4D feature maps (batch dimension included) when called on the pyramid-level outputs of the backbone. If not provided, the reference implementation from the paper will be used.
- classification_head: (Optional) A keras.Layer that performs classification of the bounding boxes. If not provided, a simple ConvNet with 3 layers will be used.
- box_head: (Optional) A keras.Layer that performs regression of the bounding boxes. If not provided, a simple ConvNet with 3 layers will be used.
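To make the default anchor parameterization above more concrete, here is a minimal pure-Python sketch of the anchor shapes it implies. It is not the library's implementation. It assumes that an aspect ratio means width / height, that each anchor keeps the area (size * scale) ** 2, and that one set of anchors is placed on every grid cell of each pyramid level; the function names are hypothetical.

```python
import math

# Default parameterization quoted in the anchor_generator argument.
strides = [2**i for i in range(3, 8)]
scales = [2**x for x in [0, 1 / 3, 2 / 3]]
sizes = [32.0, 64.0, 128.0, 256.0, 512.0]
aspect_ratios = [0.5, 1.0, 2.0]


def anchor_shapes(size, scales, aspect_ratios):
    """Return (width, height) pairs for one pyramid level.

    Assumption: ratio = width / height, area = (size * scale) ** 2.
    """
    shapes = []
    for scale in scales:
        area = (size * scale) ** 2
        for ratio in aspect_ratios:
            height = math.sqrt(area / ratio)
            width = ratio * height
            shapes.append((width, height))
    return shapes


# One list of anchor shapes per stride (that is, per pyramid level).
shapes_per_level = {
    stride: anchor_shapes(size, scales, aspect_ratios)
    for stride, size in zip(strides, sizes)
}
anchors_per_location = len(scales) * len(aspect_ratios)


def total_anchors(image_size):
    """Count anchors for a square image, assuming ceil-divided grids."""
    return sum(
        math.ceil(image_size / stride) ** 2 * anchors_per_location
        for stride in strides
    )


print(anchors_per_location)  # 9
print(total_anchors(512))  # 49104
```

Under these assumptions, three scales times three aspect ratios give nine anchors per grid cell, spread over five levels with strides 8 through 128.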
Example Usage
Pretrained RetinaNet model
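import keras_hub
import numpy as np

# Load the pretrained detector and run it on a random batch of two images.
object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "retinanet_resnet50_fpn_v2_coco"
)
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)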
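The default prediction decoder described under Arguments prunes overlapping boxes with non-max suppression. The following is a minimal pure-Python sketch of greedy, single-class NMS to show the idea. It is not the layer's actual implementation; the function names, the "xyxy" box layout, and the 0.5 IoU threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (hypothetical values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the distant third box survives.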
Fine-tune the pre-trained model
import keras_hub

# CLASSES is assumed to be a list of the class names in your dataset.
backbone = keras_hub.models.Backbone.from_preset(
    "retinanet_resnet50_fpn_v2_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "retinanet_resnet50_fpn_v2_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
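The example above refers to a CLASSES list that the snippet itself does not define. One way to set it up, sketched here with hypothetical class names, is to keep a list of names and map each to an integer id in the range [0, num_classes), as the num_classes argument expects. The encode_labels helper is illustrative and not part of KerasHub.

```python
# Hypothetical class names; replace with the categories in your dataset.
CLASSES = ["cat", "dog", "bird"]

# Map each name to an integer id in the range [0, num_classes).
class_to_id = {name: index for index, name in enumerate(CLASSES)}
num_classes = len(CLASSES)


def encode_labels(names):
    """Translate class names to integer ids, rejecting unknown names."""
    try:
        return [class_to_id[name] for name in names]
    except KeyError as err:
        raise ValueError(f"Unknown class name: {err.args[0]}") from err


print(encode_labels(["dog", "cat"]))  # [1, 0]
```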
Custom training the model
import keras_hub

# Rescale pixel values from [0, 255] to [0, 1].
image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1/255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)
# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False
)
# CLASSES is assumed to be a list of the class names in your dataset.
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
Example Usage with Hugging Face URI
Pretrained RetinaNet model
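import keras_hub
import numpy as np

# Load the pretrained detector from Hugging Face and run it on a random batch.
object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "hf://keras/retinanet_resnet50_fpn_v2_coco"
)
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)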
Fine-tune the pre-trained model
import keras_hub

# CLASSES is assumed to be a list of the class names in your dataset.
backbone = keras_hub.models.Backbone.from_preset(
    "hf://keras/retinanet_resnet50_fpn_v2_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "hf://keras/retinanet_resnet50_fpn_v2_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
Custom training the model
import keras_hub

# Rescale pixel values from [0, 255] to [0, 1].
image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1/255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)
# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False
)
# CLASSES is assumed to be a list of the class names in your dataset.
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)