YOLOv12n LiDAR BEV Object Detection Model
Model Overview
This is a custom-trained YOLOv12n model for object detection on Bird's Eye View (BEV) RGB images generated from LiDAR 3D point clouds. The training data is derived from the KITTI dataset: raw LiDAR point clouds are converted into 2D BEV images.
Dataset
- Source: KITTI Dataset
- Preprocessing: LiDAR point clouds converted into 2D RGB BEV images (a minimal conversion sketch follows this list)
- Labels: custom annotations created for the BEV images
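The conversion code itself is not part of this card; the sketch below shows one common KITTI-style BEV encoding (an assumption, not necessarily the exact pipeline used here): points are binned onto a fixed ground-plane grid, and each cell's maximum height, maximum intensity, and point density become the R, G, and B channels. The function name `kitti_bin_to_bev` and all ranges are illustrative; a 60.8 m × 60.8 m region at 0.1 m/pixel yields the 608 × 608 input size used for training.

```python
import numpy as np

def kitti_bin_to_bev(bin_path, x_range=(0.0, 60.8), y_range=(-30.4, 30.4),
                     z_range=(-2.0, 1.0), resolution=0.1):
    """Convert a KITTI velodyne .bin point cloud into a 608x608 RGB BEV image.

    Channel encoding (one common convention, assumed here):
    R = normalized max height, G = max intensity, B = log point density.
    """
    points = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)  # x, y, z, intensity

    # Keep only points inside the region of interest.
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]) &
         (points[:, 2] >= z_range[0]) & (points[:, 2] < z_range[1]))
    pts = points[m]

    # Discretize ground-plane coordinates into pixel indices.
    h = int(round((x_range[1] - x_range[0]) / resolution))  # 608 rows
    w = int(round((y_range[1] - y_range[0]) / resolution))  # 608 cols
    xi = np.clip(((pts[:, 0] - x_range[0]) / resolution).astype(np.int64), 0, h - 1)
    yi = np.clip(((pts[:, 1] - y_range[0]) / resolution).astype(np.int64), 0, w - 1)

    bev = np.zeros((h, w, 3), dtype=np.float32)
    # R: max height per cell, normalized to [0, 1].
    np.maximum.at(bev[:, :, 0], (xi, yi),
                  (pts[:, 2] - z_range[0]) / (z_range[1] - z_range[0]))
    # G: max intensity per cell (KITTI reflectance is already in [0, 1]).
    np.maximum.at(bev[:, :, 1], (xi, yi), pts[:, 3])
    # B: point count per cell, log-normalized.
    np.add.at(bev[:, :, 2], (xi, yi), 1.0)
    bev[:, :, 2] = np.log1p(bev[:, :, 2]) / np.log(64.0)

    return (np.clip(bev, 0.0, 1.0) * 255).astype(np.uint8)
```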
Training Details
- Training Platform: Kaggle Notebook
- Epochs: 300 (continual learning; the run is configured for 500 epochs with early stopping, see the resume sketch after this list)
- Batch Size: 32
- Input Image Size: 608 × 608
- Compute: 2× NVIDIA T4 GPUs (Distributed Training)
- Training Time: 14.5 hours
- Optimizer: AdamW
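Because the run was resumed rather than trained in one pass, checkpoints written every 100 epochs (`save_period=100`) presumably allowed training to continue across Kaggle sessions. A minimal resume sketch using the documented Ultralytics pattern (the checkpoint path is a placeholder):

```python
from ultralytics import YOLO

# Resume an interrupted run; Ultralytics restores the optimizer state,
# epoch counter, and hyperparameters from the checkpoint itself.
model = YOLO("path/to/runs/detect/train/weights/last.pt")
model.train(resume=True)
```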
Data Augmentation & Training Arguments
The model was trained with the following augmentations and hyperparameters:
```python
import os

from ultralytics import YOLO

# Placeholder setup (the original notebook's variable values are not shown):
Dataset_folder = "path/to/dataset"  # folder containing data.yaml
Folder_name = "yolov12n_bev"        # output project directory
batch_size = 32

# resume=True expects the last.pt of an interrupted run; for a fresh run,
# start from a base checkpoint instead and drop resume=True.
model = YOLO("path/to/last.pt")

results = model.train(
    data=os.path.join(Dataset_folder, "data.yaml"),
    epochs=500,
    imgsz=608,
    plots=True,
    batch=batch_size,
    save=True,
    save_period=100,    # write a checkpoint every 100 epochs
    device="cuda",      # for the 2x T4 setup above, device=[0, 1] enables multi-GPU DDP
    workers=4,
    project=Folder_name,
    seed=2005,
    copy_paste=0.15,    # copy-paste augmentation probability
    optimizer="AdamW",
    mosaic=1.0,         # mosaic augmentation probability
    scale=0.9,          # random scale augmentation gain
    verbose=True,
    resume=True,
    patience=100,       # early-stopping patience (epochs without improvement)
    cache=True,         # cache images for faster data loading
    amp=True            # automatic mixed precision
)
```
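The card lists mean IoU under its metrics; note that the built-in Ultralytics validator reports mAP rather than mean IoU. A quick post-training evaluation sketch (paths are placeholders):

```python
from ultralytics import YOLO

model = YOLO("path/to/best.pt")  # trained weights (placeholder path)
metrics = model.val(data="path/to/data.yaml", imgsz=608)
print(metrics.box.map50, metrics.box.map)  # mAP@0.5 and mAP@0.5:0.95
```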
Usage
To use this model for inference, load the trained weights with the Ultralytics `YOLO` API:

```python
from ultralytics import YOLO

# Load the trained weights (path is a placeholder).
model = YOLO("path/to/your/yolov12n.pt")

# Run inference; model() returns a list of Results objects, one per image.
results = model("path/to/your/image.jpg")
results[0].show()
```
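Each `Results` object exposes the detections; a minimal sketch of reading class names, confidences, and box coordinates (standard Ultralytics attributes):

```python
# Iterate over detections in the first (and here only) result.
for box in results[0].boxes:
    cls_id = int(box.cls)                  # class index
    score = float(box.conf)                # confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()  # box corners in BEV pixel coordinates
    print(f"{results[0].names[cls_id]}: {score:.2f} "
          f"at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```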
Performance & Applications
- Designed for autonomous driving and LiDAR-based perception
- Capable of detecting objects from BEV RGB images derived from 3D LiDAR data
- Suitable for real-time object detection in self-driving applications
License
- MIT

Language
- English

Metrics
- Mean IoU

Pipeline Tag
- object-detection

Tags
- autonomous
- self-driving
- LiDAR
- KITTI