|
--- |
|
license: other |
|
license_name: aplux-model-farm-license |
|
license_link: https://aiot.aidlux.com/api/v1/files/license/model_farm_license_en.pdf |
|
pipeline_tag: depth-estimation |
|
tags: |
|
- AIoT |
|
- QNN |
|
--- |
|
|
## Midas-v2: Depth Estimation |
|
|
|
MiDaS is a deep learning-based monocular depth estimation model that predicts scene depth from a single RGB image, without stereo vision or dedicated depth sensors. Trained on a diverse mix of datasets (e.g., MegaDepth, KITTI), it generalizes well across scenes, coping with complex lighting, occlusions, and both indoor and outdoor environments. The model accepts a range of input resolutions (down to 256x256 pixels) while preserving fine detail, and its computational efficiency suits real-time inference and lightweight deployment on mobile and edge devices. It is widely used in autonomous driving (obstacle detection), AR/VR (3D reconstruction), and robotic navigation, where it can substantially reduce hardware cost. Later releases (e.g., MiDaS v3) add transformer backbones and improve small-object recognition and edge accuracy.
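
The following is a minimal inference sketch using the upstream MiDaS PyTorch Hub entry points (mirroring the usage shown in the isl-org/MiDaS README), not the QNN deployable model distributed through Model Farm; the input file name is a placeholder.

```python
import cv2
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# "MiDaS_small" is the lightweight variant targeting 256x256 inputs.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").to(device).eval()
midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = midas_transforms.small_transform  # resize + normalization

img = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)  # placeholder path
batch = transform(img).to(device)  # preprocessed batch, nominally 1x3x256x256

with torch.no_grad():
    prediction = midas(batch)  # relative inverse depth, 1xHxW
    # Upsample the prediction back to the original image resolution.
    depth = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()

depth_map = depth.cpu().numpy()  # (H, W) relative depth values
```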
|
|
|
### Source model |
|
|
|
- Input shape: 1x3x256x256 |
|
- Number of parameters: 20.33M |
|
- Model size: 82.17 MB
|
- Output shape: 1x1x256x256 |
|
|
|
The source model can be found [here](https://github.com/isl-org/MiDaS).
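
As a rough check against the shapes listed above, the sketch below loads the `MiDaS_small` checkpoint from PyTorch Hub (which approximates these specs, though it is not necessarily identical to the converted model) and traces it to ONNX with a fixed 1x3x256x256 dummy input. The output file name and opset are placeholders; the actual QNN conversion flow is documented on Model Farm.

```python
import torch

model = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").eval()
dummy = torch.randn(1, 3, 256, 256)  # matches the listed input shape

with torch.no_grad():
    out = model(dummy)
print(out.shape)  # torch.Size([1, 256, 256]); unsqueeze(1) gives 1x1x256x256

torch.onnx.export(
    model, dummy, "midas_v2_256.onnx",  # placeholder file name
    input_names=["image"], output_names=["depth"],
    opset_version=11,  # placeholder opset
)
```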
|
|
|
## Performance Reference |
|
|
|
Please search for the model by name in [Model Farm](https://aiot.aidlux.com/en/models).
|
|
|
## Inference & Model Conversion |
|
|
|
Please search for the model by name in [Model Farm](https://aiot.aidlux.com/en/models).
|
|
|
## License |
|
|
|
- Source Model: [MIT](https://github.com/isl-org/MiDaS/blob/master/LICENSE) |
|
|
|
- Deployable Model: [APLUX-MODEL-FARM-LICENSE](https://aiot.aidlux.com/api/v1/files/license/model_farm_license_en.pdf) |