The official implementation is available on GitHub.
Zero-Shot Depth from Defocus
Yiming Zuo* Β· Hongyu Wen* Β· Venkat Subramanian* Β· Patrick Chen Β· Karhan Kayan Β· Mario Bijelic Β· Felix Heide Β· Jia Deng
(*Equal Contribution)
Princeton Vision & Learning Lab (PVL)
Paper Β· Project
Roadmap
- β Release FOSSA training code
- β Release FOSSA evaluation code
- β Release ZEDD dataset and test server
Installation & Setup
Step 1: Create and activate conda environment
conda create -n fossa python=3.8
conda activate fossa
Step 2: Install Dependencies
pip install -r requirements.txt
Step 3: Build PowerExpPSF CUDA Extension
This is required for training and evaluation with synthetic defocus effects.
Build steps
cd power_exp_psf
# Build and install the extension
python setup.py build_ext --inplace
# Verify successful installation
python - <<'PY'
import os
import torch  # torch must be imported before loading the CUDA extension
try:
    import power_exp_psf_cuda
    path = power_exp_psf_cuda.__file__
    if os.path.exists(path):
        print(f"SUCCESS: power_exp_psf_cuda loaded from {path}")
    else:
        print(f"ERROR: module loaded but file does not exist at {path}")
except Exception as e:
    print(f"IMPORT FAILED: {e}")
PY
cd ..
# Add power_exp_psf as a search directory for imports
export PYTHONPATH=$PWD/power_exp_psf:$PYTHONPATH
Step 4: Load datasets into dataset/datasets
Datasets download instructions
π¦ HAMMER
Download: HAMMER Dataset prepared by MoGe2.
cd dataset/datasets
wget https://huggingface.co/datasets/Ruicheng/monocular-geometry-evaluation/resolve/main/HAMMER.zip
unzip HAMMER.zip
rm -f HAMMER.zip
cd ../..
π¦ DDFF-12
Data split
cd dataset/datasets
mkdir ddff12_val_generation
cd ddff12_val_generation
mkdir third_part
Then, in your browser, navigate to the DFV Split (MS Sharepoint) prepared by DFF-DFV.
Click the download button. Then, copy the downloaded "my_ddff_trainVal.h5" file into dataset/datasets/ddff12_val_generation and rename it to "dfv_trainVal.h5".
Intrinsics matrix:
The intrinsics matrix is also provided by DFV as a .mat file.
Download the raw file from the GitHub UI and place the downloaded IntParamLF.mat at "dataset/datasets/ddff12_val_generation/third_part/".
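Before running validation, it can help to confirm the DDFF-12 files landed in the expected locations. The helper below is a minimal sketch (not part of the repo); the expected relative paths follow the manual steps above.

```python
import os

# Hypothetical helper (not part of this codebase): list which of the
# manually downloaded DDFF-12 files are missing under the dataset root.
EXPECTED = [
    "ddff12_val_generation/dfv_trainVal.h5",
    "ddff12_val_generation/third_part/IntParamLF.mat",
]

def missing_ddff12_files(root="dataset/datasets"):
    """Return the relative paths from EXPECTED that do not exist under root."""
    return [rel for rel in EXPECTED if not os.path.exists(os.path.join(root, rel))]

if __name__ == "__main__":
    missing = missing_ddff12_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("DDFF-12 layout looks complete.")
```

If anything is reported missing, re-check the renaming step ("my_ddff_trainVal.h5" to "dfv_trainVal.h5") and the `third_part` directory.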
Datasets that are loaded from HuggingFace (no user downloading necessary)
The following datasets are automatically managed by the code from Hugging Face:
Validation Quickstart
Running Validation
The easiest way to validate is using the distributed validation script:
bash dist_val.sh --encoder [VITS/VITB] --resumed_from [NAME OF PARAMETERS] --val_loader_config_choice [VAL_CONFIG_CHOICE]
Model Loading Options
Option 1: Load from HuggingFace Hub (recommended)
resumed_from='model_name' # automatically pulled from venkatsubra/model_name
Option 2: Load from local path
resumed_from='/path/to/model.pth'
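The two options above can be sketched as a single resolution step: an existing local path is used directly, and anything else is treated as a model name under the `venkatsubra` Hub namespace. The function name and return format below are hypothetical, not the repo's actual API.

```python
import os

def resolve_checkpoint(resumed_from, hub_user="venkatsubra"):
    """Hypothetical sketch of the two loading options above.

    An existing local path is used as-is; any other string is treated as a
    Hugging Face Hub model name under `hub_user`.
    """
    if os.path.exists(resumed_from):
        return ("local", resumed_from)
    # The actual download could use huggingface_hub.hf_hub_download(
    # repo_id=f"{hub_user}/{resumed_from}", filename=...).
    return ("hub", f"{hub_user}/{resumed_from}")
```

Note that a typo in a local path silently falls through to the Hub branch, so a "repo not found" error from Hugging Face may actually mean a mistyped local path.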
Troubleshooting
PowerExpPSF building
β Error: nvcc not found / CUDA extension build fails
If you see an error like: "error: [Errno 2] No such file or directory: '/usr/local/cuda-12.1/bin/nvcc'" or "nvcc not found", this means your environment does not have a CUDA toolkit with nvcc available.
β Fix: Load a valid CUDA toolkit and set environment variables
export CUDA_HOME=/usr/local/cuda-12.6
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
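After setting these variables, the sketch below can confirm that an `nvcc` binary is actually visible, either under `CUDA_HOME/bin` or on `PATH`. The helper is hypothetical, not part of the repo.

```python
import os
import shutil

def find_nvcc(cuda_home=None):
    """Hypothetical check: locate nvcc under CUDA_HOME/bin, falling back
    to PATH. Returns the nvcc path, or None if no toolkit is visible."""
    cuda_home = cuda_home or os.environ.get("CUDA_HOME")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Fall back to whatever nvcc is first on PATH, if any.
    return shutil.which("nvcc")
```

If this returns None, the extension build will fail regardless of the other environment variables; install or module-load a CUDA toolkit first.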
Acknowledgments
This codebase is partially based on Depth Anything v2, Video Depth Anything, DFF-DFV, and Unsupervised Depth from Focus.
Citation
@article{ZeroShotDepthFromDefocus,
  author  = {Zuo, Yiming and Wen, Hongyu and Subramanian, Venkat and Chen, Patrick and Kayan, Karhan and Bijelic, Mario and Heide, Felix and Deng, Jia},
  title   = {Zero-Shot Depth from Defocus},
  journal = {arXiv preprint arXiv:2603.26658},
  year    = {2026},
  url     = {https://arxiv.org/abs/2603.26658}
}