---
license: apache-2.0
language:
- en
pipeline_tag: depth-estimation
new_version: prs-eth/marigold-depth-v1-1
pinned: true
tags:
- depth estimation
- image analysis
- computer vision
- in-the-wild
- zero-shot
---

<h1 align="center">Marigold Depth v1-0 Model Card</h1>

<p align="center">
  <a title="Image Depth" href="https://huggingface.co/spaces/prs-eth/marigold" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97%20Image%20Depth%20-Demo-yellow" alt="Image Depth">
  </a>
  <a title="diffusers" href="https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97%20diffusers%20-Integration%20🧨-yellow" alt="diffusers">
  </a>
  <a title="Github" href="https://github.com/prs-eth/marigold" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/github/stars/prs-eth/marigold?label=GitHub%20%E2%98%85&logo=github&color=C8C" alt="Github">
  </a>
  <a title="Website" href="https://marigoldmonodepth.github.io/" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%E2%99%A5%20Project%20-Website-blue" alt="Website">
  </a>
  <a title="arXiv" href="https://arxiv.org/abs/2312.02145" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%F0%9F%93%84%20Read%20-Paper-AF3436" alt="arXiv">
  </a>
  <a title="Social" href="https://twitter.com/antonobukhov1" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/twitter/follow/antonobukhov1?label=Subscribe%20for%20updates!" alt="Social">
  </a>
  <a title="License" href="https://www.apache.org/licenses/LICENSE-2.0" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/License-Apache--2.0-929292" alt="License">
  </a>
</p>

<h2 align="center">
  <a href="https://huggingface.co/prs-eth/marigold-depth-v1-1">NEW: Marigold Depth v1-1 Model</a>
</h2>

This is the model card for the `marigold-depth-v1-0` model for monocular (single-image) depth estimation.
The model is fine-tuned from the `stable-diffusion-2` [model](https://huggingface.co/stabilityai/stable-diffusion-2) as
described in our [CVPR 2024 paper](https://arxiv.org/abs/2312.02145) titled "Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation".

- Play with the interactive [Hugging Face Spaces demo](https://huggingface.co/spaces/prs-eth/marigold): check out how the model works with example images or upload your own.
- Use it with [diffusers](https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage) to compute the results with a few lines of code, as sketched below.
- Get to the bottom of things with our [official codebase](https://github.com/prs-eth/marigold).
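
The snippet below is a minimal, non-authoritative sketch of the `diffusers` route. It assumes this checkpoint is published under the repository id `prs-eth/marigold-depth-v1-0` and uses an example image from the project website; refer to the `diffusers` Marigold guide linked above for authoritative usage.

```python
import torch
from diffusers import MarigoldDepthPipeline
from diffusers.utils import load_image

# Load the depth pipeline from this checkpoint (repository id assumed, see above).
pipe = MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-v1-0", torch_dtype=torch.float16
).to("cuda")

# Any RGB image works; this one comes from the project website.
image = load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")

# Default DDIM schedule; the output carries an affine-invariant depth map in [0, 1].
depth = pipe(image)

# Colorize the prediction and save it to disk.
vis = pipe.image_processor.visualize_depth(depth.prediction)
vis[0].save("einstein_depth_colored.png")
```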

## Model Details

- **Developed by:** [Bingxin Ke](http://www.kebingxin.com/), [Anton Obukhov](https://www.obukhov.ai/), [Shengyu Huang](https://shengyuh.github.io/), [Nando Metzger](https://nandometzger.github.io/), [Rodrigo Caye Daudt](https://rcdaudt.github.io/), [Konrad Schindler](https://scholar.google.com/citations?user=FZuNgqIAAAAJ).
- **Model type:** Generative latent diffusion-based affine-invariant monocular depth estimation.
- **Language:** English.
- **License:** [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
- **Model Description:** This model can be used to generate an estimated depth map of an input image.
- **Resolution**: Although inputs of any resolution can be processed, the model inherits the base diffusion model's effective resolution of roughly **768** pixels.
  For optimal predictions, larger input images should therefore be resized so that the longer side measures 768 pixels before being fed to the model.
- **Steps and scheduler**: This model was designed for use with the **DDIM** scheduler and between **10 and 50** denoising steps.
  Good predictions can also be obtained with just **one** step by setting `"timestep_spacing": "trailing"`
  in the [scheduler configuration file](scheduler/scheduler_config.json), or by adding `pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")`
  after the pipeline is loaded and before its first use (a minimal sketch is given after the citation below). For compatibility reasons, this `v1-0` model is kept identical to the paper setting; a
  [newer v1-1 model](https://huggingface.co/prs-eth/marigold-depth-v1-1) provides optimal settings for all possible step configurations.
- **Outputs**:
  - **Affine-invariant depth map**: The predicted values lie between 0 and 1, interpolating between the near and far planes chosen by the model.
  - **Uncertainty map**: Produced only when multiple predictions are ensembled, i.e., with an ensemble size larger than 2.
- **Resources for more information:** [Project Website](https://marigoldmonodepth.github.io/), [Paper](https://arxiv.org/abs/2312.02145), [Code](https://github.com/prs-eth/marigold).
- **Cite as:**

```bibtex
@InProceedings{ke2023repurposing,
    title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
    author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2024}
}
```
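
The scheduler override and the ensembling behavior described above can be combined as in the following sketch, under the same assumptions as the earlier snippet (repository id `prs-eth/marigold-depth-v1-0`, `diffusers` Marigold pipeline, example image from the project website):

```python
import torch
from diffusers import DDIMScheduler, MarigoldDepthPipeline
from diffusers.utils import load_image

pipe = MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-v1-0", torch_dtype=torch.float16  # repository id assumed
).to("cuda")

# Switch the DDIM scheduler to "trailing" timestep spacing so that even a single
# denoising step lands on a useful timestep (see "Steps and scheduler" above).
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

image = load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")

# One denoising step per prediction; ensembling more than 2 predictions also
# yields an uncertainty map (see "Outputs" above).
out = pipe(image, num_inference_steps=1, ensemble_size=5, output_uncertainty=True)

pipe.image_processor.visualize_depth(out.prediction)[0].save("depth_colored.png")
pipe.image_processor.visualize_uncertainty(out.uncertainty)[0].save("uncertainty.png")
```

Keep in mind that this `v1-0` checkpoint targets the 10-50 step regime described above; the [newer v1-1 model](https://huggingface.co/prs-eth/marigold-depth-v1-1) ships settings optimized for all step configurations, including single-step inference.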