# Use Pure Point Cloud Dataset

## Data Pre-Processing

### Convert point cloud format

Currently, we only support point clouds in bin format for training and inference. Before training on your own dataset, you need to convert your point cloud files into bin format. Common point cloud data formats include pcd and las; we provide some open-source tools for reference:

1. Convert pcd to bin: https://github.com/leofansq/Tools_RosBag2KITTI
2. Convert las to bin: The common conversion path is las -> pcd -> bin, and the conversion from las to pcd can be achieved through [this tool](https://github.com/Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor).
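If you only need a quick one-off conversion, the pcd-to-bin step can also be scripted directly. Below is a minimal sketch (not part of MMDetection3D), assuming `open3d` and `numpy` are installed; `input.pcd` and `output.bin` are placeholder paths, and since `open3d` only reads x/y/z coordinates, the intensity channel that most LiDAR detectors expect is padded with zeros:

```python
import numpy as np
import open3d as o3d

# Read the pcd file; open3d parses only the x/y/z coordinates.
pcd = o3d.io.read_point_cloud('input.pcd')  # placeholder path
points = np.asarray(pcd.points, dtype=np.float32)

# KITTI-style bin files store 4 float32 values per point
# (x, y, z, intensity). open3d drops intensity, so pad a zero column.
intensity = np.zeros((points.shape[0], 1), dtype=np.float32)
points = np.hstack([points, intensity])

# Write the raw float32 buffer; this is the bin format described above.
points.tofile('output.bin')  # placeholder path
```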
### Point cloud annotation

MMDetection3D does not support point cloud annotation. Some open-source annotation tools are offered for reference:

- [SUSTechPOINTS](https://github.com/naurril/SUSTechPOINTS)
- [LATTE](https://github.com/bernwang/latte)

Besides, we improved [LATTE](https://github.com/bernwang/latte) for better usability. More details can be found [here](https://arxiv.org/abs/2011.10174).

## Support new data format

To support a new data format, you can either convert it to an existing format or directly convert it to the middle format. You can also choose to do the conversion offline (by a script, before training) or online (by implementing a new dataset class that performs the conversion during training).

### Reorganize new data formats to existing format

If your dataset only contains point cloud files and 3D bounding box annotations, without calibration files, we recommend converting it into the basic format. The annotation files in the basic format contain the following necessary keys:

```python
[
    {'sample_idx':
     'lidar_points': {'lidar_path': velodyne_path,
                      ....
                     },
     'annos': {'box_type_3d':  (str)  'LiDAR/Camera/Depth'
               'gt_bboxes_3d':  (n, 7)
               'gt_names':  [list]
               ....
              },
     'calib': { .....},
     'images': { .....}
    }
]
```

In MMDetection3D, for data that is inconvenient to read directly online, we recommend converting it into the basic format above and doing the conversion offline. After the conversion, you only need to modify the data annotation paths and the classes in the config.

To use data that shares a similar format with an existing dataset, e.g., Lyft, whose format is similar to that of nuScenes, we recommend directly implementing a new data converter and a new dataset class to convert and load the data, respectively. In this procedure, the code can inherit from the existing dataset classes to reuse code (a minimal subclass sketch is given at the end of this page).

### Reorganize new data format to middle format

There is also a way out if users do not want to convert the annotation format to an existing format. Actually, we convert all our supported datasets into pickle files, which summarize useful information for model training and inference.

The annotation of a dataset is a list of dicts, where each dict corresponds to one frame. A basic example (used in KITTI) is shown below. A frame consists of several keys, like `image`, `point_cloud`, `calib` and `annos`. As long as the data can be read directly according to this information, the organization of the raw data can differ from that of the existing datasets. With this design, we provide an alternative choice for customizing datasets.

```python
[
    {'image': {'image_idx': 0,
               'image_path': 'training/image_2/000000.png',
               'image_shape': array([ 370, 1224], dtype=int32)},
     'point_cloud': {'num_features': 4,
                     'velodyne_path': 'training/velodyne/000000.bin'},
     'calib': {'P0': array([[707.0493,   0.    , 604.0814,   0.    ],
                            [  0.    , 707.0493, 180.5066,   0.    ],
                            [  0.    ,   0.    ,   1.    ,   0.    ],
                            [  0.    ,   0.    ,   0.    ,   1.    ]]),
               'P1': array([[ 707.0493,    0.    ,  604.0814, -379.7842],
                            [   0.    ,  707.0493,  180.5066,    0.    ],
                            [   0.    ,    0.    ,    1.    ,    0.    ],
                            [   0.    ,    0.    ,    0.    ,    1.    ]]),
               'P2': array([[ 7.070493e+02,  0.000000e+00,  6.040814e+02,  4.575831e+01],
                            [ 0.000000e+00,  7.070493e+02,  1.805066e+02, -3.454157e-01],
                            [ 0.000000e+00,  0.000000e+00,  1.000000e+00,  4.981016e-03],
                            [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]]),
               'P3': array([[ 7.070493e+02,  0.000000e+00,  6.040814e+02, -3.341081e+02],
                            [ 0.000000e+00,  7.070493e+02,  1.805066e+02,  2.330660e+00],
                            [ 0.000000e+00,  0.000000e+00,  1.000000e+00,  3.201153e-03],
                            [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]]),
               'R0_rect': array([[ 0.9999128 ,  0.01009263, -0.00851193,  0.        ],
                                 [-0.01012729,  0.9999406 , -0.00403767,  0.        ],
                                 [ 0.00847068,  0.00412352,  0.9999556 ,  0.        ],
                                 [ 0.        ,  0.        ,  0.        ,  1.        ]]),
               'Tr_velo_to_cam': array([[ 0.00692796, -0.9999722 , -0.00275783, -0.02457729],
                                        [-0.00116298,  0.00274984, -0.9999955 , -0.06127237],
                                        [ 0.9999753 ,  0.00693114, -0.0011439 , -0.3321029 ],
                                        [ 0.        ,  0.        ,  0.        ,  1.        ]]),
               'Tr_imu_to_velo': array([[ 9.999976e-01,  7.553071e-04, -2.035826e-03, -8.086759e-01],
                                        [-7.854027e-04,  9.998898e-01, -1.482298e-02,  3.195559e-01],
                                        [ 2.024406e-03,  1.482454e-02,  9.998881e-01, -7.997231e-01],
                                        [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]])},
     'annos': {'name': array(['Pedestrian'], dtype='<U10'),
               'truncated': array([0.]),
               'occluded': array([0]),
               'alpha': array([-0.2]),
               'bbox': array([[712.4 , 143.  , 810.73, 307.92]]),
               'dimensions': array([[1.2 , 1.89, 0.48]]),
               'location': array([[1.84, 1.47, 8.41]]),
               'rotation_y': array([0.01]),
               'score': array([0.]),
               'index': array([0], dtype=int32),
               'group_ids': array([0], dtype=int32),
               'difficulty': array([0], dtype=int32),
               'num_points_in_gt': array([377], dtype=int32)}
    }
]
```
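To make the offline conversion concrete, the sketch below builds a one-frame info list in the middle format and dumps it with `pickle`. It is only an illustration: the file name `my_dataset_infos_train.pkl` and all field values are placeholders, not an official converter.

```python
import pickle

import numpy as np

# One dict per frame, mirroring the middle format shown above.
# All paths and values are placeholders for illustration.
infos = [{
    'point_cloud': {'num_features': 4,
                    'velodyne_path': 'training/velodyne/000000.bin'},
    'annos': {'name': np.array(['Pedestrian']),
              'location': np.array([[1.84, 1.47, 8.41]]),   # x, y, z
              'dimensions': np.array([[1.2, 1.89, 0.48]]),  # l, h, w
              'rotation_y': np.array([0.01])},
}]

# Dump the info file once, offline, before training.
with open('my_dataset_infos_train.pkl', 'wb') as f:
    pickle.dump(infos, f)

# At training time, the dataset class loads it back.
with open('my_dataset_infos_train.pkl', 'rb') as f:
    loaded = pickle.load(f)
print(loaded[0]['point_cloud']['velodyne_path'])
```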
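Returning to the existing-format route above: when your data resembles a supported dataset, the new dataset class can be a thin subclass of the existing one. The sketch below assumes the MMDetection3D 1.x registry API; `MyDataset` and its class names are hypothetical.

```python
from mmdet3d.datasets import KittiDataset
from mmdet3d.registry import DATASETS


@DATASETS.register_module()
class MyDataset(KittiDataset):
    """Hypothetical dataset that reuses the KITTI loading logic.

    Only the class names are overridden here; info-file parsing and
    ground-truth handling are inherited from KittiDataset.
    """

    # Replace these with the classes of your own dataset.
    METAINFO = {'classes': ('Pedestrian', 'Cyclist', 'Car')}
```

After registration, setting `dataset_type = 'MyDataset'` and pointing `ann_file` at your converted info files in the config should be enough to train with it.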