gkioxari committed on
Commit cdd5e0c · 1 Parent(s): 08b8294

Create README

Files changed (1)
  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
1
+ ---
2
+ tags:
3
+ - vision
4
+ - 3D
5
+ - 3D object detection
6
+ datasets:
7
+ - omni3d
8
+ metrics:
9
+ - AP
10
+ ---
11
+
12
+ # 3D Object Detection with Cube R-CNN
13
+
14
+ Cube R-CNN, a model for 3D object detection, is described in [**Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild**](https://arxiv.org/abs/2207.10660) and released in this [repository](https://github.com/facebookresearch/omni3d).
15
+
16
+ ## Overview
17
+ The model and its architecture are shown below.
18
+
19
+ <img src="https://s3.amazonaws.com/moonup/production/uploads/1666115971617-634ededbd049354d7ee4b557.png" width=700px/>
20
+
21
+ ## Training Data
22
+
23
+ Cube R-CNN was trained on Omni3D, a large benchmark for 3D object detection in the wild.
24
+
25
+ ## Demo: Inference on Any Image
26
+
27
+ The model detects objects in 3D from a single image. There are 50 distinct object categories including *car, truck, chair, table, cabinet, books, and many more*.
28
+ The model assumes a known focal length for the image in order to predict the correct metric scale.
29
+ However, users can provide any focal length, in which case the predictions are on a "relative" scale.
30
+
31
+ For example, we can predict 3D objects from COCO images with a user-defined focal length of 4.0, as shown below:
32
+
33
+ <img src="https://github.com/facebookresearch/omni3d/blob/main/.github/generalization_coco.png?raw=true" width=500px/>
34
+
35
+ The above output is produced by our demo:
36
+
37
+ ```bash
38
+ python demo/demo.py \
39
+ --config cubercnn://omni3d/cubercnn_DLA34_FPN.yaml \
40
+ --input-folder "datasets/image_inputs" \
41
+ --threshold 0.25 --focal 4.0 --display \
42
+ MODEL.WEIGHTS cubercnn://omni3d/cubercnn_DLA34_FPN.pth \
43
+ OUTPUT_DIR output/demo
44
+ ```
45
+
46
+ ## Checkpoints
47
+
48
+ You can find model checkpoints in the original [model zoo](https://github.com/facebookresearch/omni3d/blob/main/MODEL_ZOO.md).
49
+
50
+ ## Intended Use and Limitations
51
+
52
+ Cube R-CNN is a data-driven method trained on an annotated dataset, Omni3D. The purpose of the project is to advance 3D computer vision and 3D object recognition. The dataset contains a *pedestrian* category, which we acknowledge as a potential issue in the case of unethical applications of our model.
53
+
54
+ The limitations of our approach include erroneous predictions, especially for faraway objects, and mistakes in predicting rotation and depth. Our evaluation reports an analysis across various depths and object sizes to better understand performance.
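
A note on the focal-length remark in the demo section: the README says a wrong focal length gives predictions on a "relative" scale. The sketch below illustrates why, using only the standard pinhole camera model. It is not Cube R-CNN's code. The function names, the focal lengths in pixels, and the principal point are hypothetical values chosen for illustration; how the demo's `--focal` value maps to pixels is not specified here.

```python
# Pinhole-camera illustration (an assumption-laden sketch, not the model's internals).

def project(point_xyz, focal_px, cx, cy):
    """Project a 3D camera-frame point (meters) to pixel coordinates."""
    x, y, z = point_xyz
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def depth_from_height(focal_px, object_height_m, pixel_height):
    """Depth at which an object of known height spans pixel_height pixels: Z = f * H / h."""
    return focal_px * object_height_m / pixel_height


# Hypothetical numbers: a 1.5 m tall object that spans 150 pixels in the image.
true_focal = 1000.0   # the camera's actual focal length, in pixels
guess_focal = 500.0   # a user-supplied guess that is half the true value

z_true = depth_from_height(true_focal, 1.5, 150.0)
z_guess = depth_from_height(guess_focal, 1.5, 150.0)

# The same pixel is explained equally well by (f, Z) and (s * f, s * Z),
# so a wrong focal length rescales depth by the same factor s.
pixel_true = project((1.0, 0.5, z_true), true_focal, 320.0, 240.0)
pixel_guess = project((1.0, 0.5, z_guess), guess_focal, 320.0, 240.0)

print(z_true, z_guess)          # depths under the true and guessed focal lengths
print(pixel_true, pixel_guess)  # identical pixel coordinates
```

Here the guessed focal length is half the true one, so the recovered depth is also halved while the projected pixel is unchanged. In this simplified picture, depths obtained with an arbitrary focal length are consistent with each other but not in true metric units.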