Create README
README.md
ADDED
@@ -0,0 +1,54 @@
---
tags:
- vision
- 3D
- 3D object detection
datasets:
- omni3d
metrics:
- AP
---

# 3D Object Detection with Cube R-CNN

Cube R-CNN, a model for 3D object detection, is described in [**Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild**](https://arxiv.org/abs/2207.10660) and released in this [repository](https://github.com/facebookresearch/omni3d).

## Overview
An overview of the model and its architecture is shown below.

<img src="https://s3.amazonaws.com/moonup/production/uploads/1666115971617-634ededbd049354d7ee4b557.png" width=700px/>

## Training Data

Cube R-CNN was trained on Omni3D, a large benchmark for 3D object detection in the wild.

## Demo: Inference on Any Image

The model detects objects in 3D from a single image. It covers 50 distinct object categories, including *car, truck, chair, table, cabinet, books*, and many more.
The model assumes a known focal length for the image in order to predict the correct metric scale.
However, users can provide any focal length, in which case the predictions are on a "relative" scale rather than a metric one.
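
To see why an arbitrary focal length gives only a relative scale, recall the pinhole camera model: an object of physical height `H` at depth `z` spans `h = f * H / z` pixels. The sketch below is not Cube R-CNN code. `depth_from_size` is a hypothetical helper, the numbers are made up, and the focal length is taken in pixels, which is an assumption of this sketch (this README does not state the unit of the demo's `--focal` value). Under this simplified model, the recovered depth for a fixed apparent size scales linearly with the assumed focal length, so a different focal length rescales depths by a constant factor.

```python
def depth_from_size(focal_px: float, real_height_m: float, pixel_height: float) -> float:
    """Pinhole model: depth z = f * H / h (f and h in pixels, H in meters)."""
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_px * real_height_m / pixel_height


# Hypothetical numbers: a 1.5 m tall object that spans 150 px in the image.
true_focal_px = 1000.0
guessed_focal_px = 2000.0  # an arbitrary, user-supplied focal length

z_true = depth_from_size(true_focal_px, 1.5, 150.0)
z_guess = depth_from_size(guessed_focal_px, 1.5, 150.0)

print(z_true)   # 10.0
print(z_guess)  # 20.0
# The depth ratio equals the focal-length ratio: only the overall scale changes.
print(z_guess / z_true == guessed_focal_px / true_focal_px)  # True
```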

For example, we can predict 3D objects from COCO images with a user-defined focal length of 4.0, as shown below.

<img src="https://github.com/facebookresearch/omni3d/blob/main/.github/generalization_coco.png?raw=true" width=500px/>

The output above is produced by our demo:

```bash
python demo/demo.py \
--config cubercnn://omni3d/cubercnn_DLA34_FPN.yaml \
--input-folder "datasets/image_inputs" \
--threshold 0.25 --focal 4.0 --display \
MODEL.WEIGHTS cubercnn://omni3d/cubercnn_DLA34_FPN.pth \
OUTPUT_DIR output/demo
```

## Checkpoints

You can find model checkpoints in the original [model zoo](https://github.com/facebookresearch/omni3d/blob/main/MODEL_ZOO.md).

## Intended Use and Limitations

Cube R-CNN is a data-driven method trained on an annotated dataset, Omni3D. The purpose of the project is to advance 3D computer vision and 3D object recognition. The dataset contains a *pedestrian* category, which we acknowledge as a potential issue in the case of unethical applications of our model.

The limitations of our approach include erroneous predictions, especially for faraway objects, and mistakes in predicted rotation and depth. Our evaluation reports an analysis across depths and object sizes to better understand performance.
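
One general geometric factor that makes faraway objects harder, independent of Cube R-CNN's specifics, is that under a pinhole model (`z = f * H / h`) a fixed pixel error in an object's apparent size causes a depth error that grows with distance. The sketch below only illustrates that geometry. It is not an analysis from the paper, `depth_error_for_pixel_error` is a hypothetical helper, and the numbers are made up.

```python
def depth_error_for_pixel_error(focal_px, real_height_m, depth_m, pixel_error=1.0):
    """Depth error caused by over-estimating the object's pixel height by `pixel_error`."""
    pixel_height = focal_px * real_height_m / depth_m  # h = f * H / z
    perturbed_depth = focal_px * real_height_m / (pixel_height + pixel_error)
    return depth_m - perturbed_depth


# Hypothetical 1.5 m tall object seen with a 1000 px focal length at three depths.
for depth in (5.0, 20.0, 80.0):
    err = depth_error_for_pixel_error(focal_px=1000.0, real_height_m=1.5, depth_m=depth)
    print(f"depth {depth:5.1f} m -> depth error {err:.3f} m for a 1 px size error")
```

In this simplified setting the error grows roughly quadratically with depth, which is one reason a per-depth breakdown of performance is informative.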