| license: apache-2.0 | |
| language: | |
| - en | |
| base_model: | |
| - stabilityai/stable-diffusion-2 | |
| pipeline_tag: depth-estimation | |
| <!-- # DepthMaster: Taming Diffusion Models for Monocular Depth Estimation | |
| This repository represents the official implementation of the paper titled "DepthMaster: Taming Diffusion Models for Monocular Depth Estimation". --> | |
| <h1 align="center"><strong>DepthMaster: Taming Diffusion Models for Monocular Depth Estimation</strong></h1> | |
| <p align="center"> | |
| <a href="https://indu1ge.github.io/ziyangsong">Ziyang Song*</a>, | |
| <a href="https://orcid.org/0009-0001-6677-0572">Zerong Wang*</a>, | |
| <a href="https://orcid.org/0000-0001-7817-0665">Bo Li</a>, | |
| <a href="https://orcid.org/0009-0007-1175-5918">Hao Zhang</a>, | |
| <a href="https://ruijiezhu94.github.io/ruijiezhu/">Ruijie Zhu</a>, | |
| <a href="https://orcid.org/0009-0004-3280-8490">Li Liu</a>, | |
| <a href="https://pengtaojiang.github.io/">Peng-Tao Jiang†</a>, | |
| <a href="http://staff.ustc.edu.cn/~tzzhang/">Tianzhu Zhang†</a> | |
| <br> | |
| *Equal Contribution, †Corresponding Author | |
| <br> | |
| University of Science and Technology of China, vivo Mobile Communication Co., Ltd. | |
| <br> | |
| <b>arXiv 2025</b> | |
| </p> | |
| <!-- [Ziyang Song*](https://indu1ge.github.io/ziyangsong), | |
| [Zerong Wang*](), | |
| [Bo Li](https://orcid.org/0000-0001-7817-0665), | |
| [Hao Zhang](https://orcid.org/0009-0007-1175-5918), | |
| [Ruijie Zhu](https://ruijiezhu94.github.io/ruijiezhu/), | |
| [Li Liu](https://orcid.org/0009-0004-3280-8490) | |
| [Tianzhu Zhang](http://staff.ustc.edu.cn/~tzzhang/) | |
| [Peng-Tao Jiang](https://pengtaojiang.github.io/) --> | |
| <div align="center"> | |
| <a href='https://arxiv.org/abs/2501.02576'> | |
| <img src='https://img.shields.io/badge/Paper-arXiv-red'> | |
| </a> | |
| <a href='https://indu1ge.github.io/DepthMaster_page/'> | |
| <img src='https://img.shields.io/badge/Project-Page-Green'> | |
| </a> | |
| <a href='https://github.com/indu1ge/DepthMaster'> | |
| <img src='https://img.shields.io/badge/GitHub-Repository-blue?logo=github'> | |
| </a> | |
| <a href='https://www.apache.org/licenses/LICENSE-2.0'> | |
| <img src='https://img.shields.io/badge/License-Apache--2.0-929292'> | |
| </a> | |
| </div> | |
| <!-- >We present DepthMaster, a tamed single-step diffusion model designed to enhance the generalization and detail preservation abilities of depth estimation models. Through feature alignment, we effectively prevent the overfitting to texture details. By adaptively enhance --> | |
| > We present DepthMaster, a tamed single-step diffusion model that customizes generative features in diffusion models to suit the discriminative depth estimation task. We introduce a Feature Alignment module to mitigate overfitting to texture and a Fourier Enhancement module to refine fine-grained details. DepthMaster exhibits state-of-the-art zero-shot performance and superior detail preservation ability, surpassing other diffusion-based methods across various datasets. | |
| ## 🎓 Citation | |
| Please cite our paper: | |
| ```bibtex | |
| @article{song2025depthmaster, | |
| title={DepthMaster: Taming Diffusion Models for Monocular Depth Estimation}, | |
| author={Song, Ziyang and Wang, Zerong and Li, Bo and Zhang, Hao and Zhu, Ruijie and Liu, Li and Jiang, Peng-Tao and Zhang, Tianzhu}, | |
| journal={arXiv preprint arXiv:2501.02576}, | |
| year={2025} | |
| } | |
| ``` | |
| ## Acknowledgements | |
| The code is based on [Marigold](https://github.com/prs-eth/Marigold). | |
| ## 🎫 License | |
| This work is licensed under the Apache License, Version 2.0 (see [LICENSE](LICENSE.txt)). | |
| By downloading and using the code and model, you agree to the terms in the [LICENSE](LICENSE.txt). | |