FE2E: From Editor to Dense Geometry Estimator

Jiyuan Wang¹,², Chunyu Lin¹✉, Lei Sun²✝, Rongying Liu¹, Mingxing Li², Lang Nie³, Kang Liao⁴, Xiangxiang Chu², Yao Zhao¹

¹Beijing Jiaotong University
²Alibaba Group
³Chongqing University of Posts and Telecommunications
⁴Nanyang Technological University
✉ Corresponding author. ✝ Project leader.

We present FE2E, a DiT-based foundation model for monocular dense geometry prediction. FE2E adapts an advanced image editing model to dense geometry tasks and achieves strong zero-shot performance on both monocular depth and normal estimation.

📢 News

[2026-03-17]: Code and checkpoints are now available!
[2026-02-21]: FE2E was accepted by CVPR 2026!!! 🎉🎉🎉
[2025-09-05]: Paper released on arXiv.

🛠️ Setup

This release covers inference and evaluation only. Install the dependencies with:

pip install -r requirements.txt

Recommended local layout:

FE2E/
├── pretrain/
│   ├── step1x-edit-i1258.safetensors
│   ├── step1x-edit-v1p1-official.safetensors
│   └── vae.safetensors
├── lora/
│   └── LDRN.safetensors
├── infer/
│   ├── eth3d/
│   │   └── eth3d.tar
│   └── dsine_eval/
│       ├── nyuv2/
│       └── scannet/
└── logs/
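
Before running evaluation, it can help to confirm that the weights sit where the layout above expects them. The following is a minimal sketch, not part of the FE2E codebase: the helper name `missing_files` is hypothetical, and it assumes that every weight file shown in the layout is required, which this README does not state.

```python
from pathlib import Path

# File names copied from the recommended layout above. Whether evaluation
# needs every one of them is an assumption; trim the list if your setup differs.
EXPECTED_FILES = [
    "pretrain/step1x-edit-i1258.safetensors",
    "pretrain/step1x-edit-v1p1-official.safetensors",
    "pretrain/vae.safetensors",
    "lora/LDRN.safetensors",
]


def missing_files(root="."):
    """Return the expected files that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected weight files are present.")
```

Run it from the FE2E/ root. Benchmark data under infer/ is deliberately not checked, since which datasets you need depends on the benchmarks you plan to evaluate.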

🔥 Training

[ ] Training code will be released later.

🕹️ Inference

1. Prepare Model Weights

Download the base weights from the official Step1X-Edit release and place them in pretrain/.
Download the FE2E LoRA checkpoint and place it in lora/.

2. Prepare Benchmark Datasets

Depth benchmarks follow the evaluation data conventions of Marigold.
Normal benchmarks follow the evaluation data conventions of DSINE.
Place the data under infer/, as shown in the recommended layout.

Supported depth benchmarks:

nyu_v2,kitti,eth3d,diode,scannet

Supported normal benchmarks:

nyuv2,scannet,ibims,sintel
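
Note that the two lists spell the NYU dataset differently (`nyu_v2` for depth, `nyuv2` for normal), which is easy to mix up. The following sketch catches that kind of mistake before a run is launched. It is an illustration only: `SUPPORTED` and `check_dataset` are hypothetical names, not part of evaluation.py.

```python
# Benchmark names copied from the two lists above.
SUPPORTED = {
    "depth": ("nyu_v2", "kitti", "eth3d", "diode", "scannet"),
    "normal": ("nyuv2", "scannet", "ibims", "sintel"),
}


def check_dataset(task, name):
    """Return `name` if it is a supported benchmark for `task`, else raise ValueError."""
    if task not in SUPPORTED:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(SUPPORTED)}")
    if name not in SUPPORTED[task]:
        raise ValueError(
            f"{name!r} is not a {task} benchmark; expected one of {list(SUPPORTED[task])}"
        )
    return name
```

For example, `check_dataset("normal", "nyu_v2")` raises a `ValueError`, while `check_dataset("depth", "nyu_v2")` returns the name unchanged.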

3. Run Evaluation

Normal estimation (replace [dataset] with one of the supported normal benchmarks):

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MASTER_PORT=21258 \
PYTHONUNBUFFERED=1 \
python -u evaluation.py \
  --model_path ./pretrain \
  --eval_data_root ./infer \
  --output_dir ./infer/eval_verify_scannet_normal_8gpu \
  --num_gpus 8 \
  --num_samples -1 \
  --lora ./lora/LDRN.safetensors \
  --single_denoise \
  --prompt_type empty \
  --norm_type ln \
  --task_name normal \
  --normal_eval_datasets [dataset]

Depth estimation (replace [dataset] with one of the supported depth benchmarks):

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MASTER_PORT=21257 \
PYTHONUNBUFFERED=1 \
python -u evaluation.py \
  --model_path ./pretrain \
  --eval_data_root ./infer \
  --output_dir ./infer/eval_verify_eth3d_8gpu \
  --num_gpus 8 \
  --num_samples -1 \
  --lora ./lora/LDRN.safetensors \
  --single_denoise \
  --prompt_type empty \
  --norm_type ln \
  --task_name depth \
  --depth_eval_datasets [dataset]
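
The two commands above differ only in the port, the output directory, the task name, and the dataset flag. To evaluate several benchmarks in sequence, the command can be assembled programmatically. The sketch below uses only the flags and environment variables shown above; `build_eval_command` and `build_env` are hypothetical helpers, and the output directory names in the loop are illustrative rather than a convention of this repo. It passes one dataset per run, because this README does not say whether the dataset flags accept several names.

```python
import os
import shlex


def build_eval_command(task, dataset, output_dir):
    """Assemble the evaluation.py argument list for one benchmark (hypothetical helper)."""
    if task not in ("depth", "normal"):
        raise ValueError(f"task must be 'depth' or 'normal', got {task!r}")
    return [
        "python", "-u", "evaluation.py",
        "--model_path", "./pretrain",
        "--eval_data_root", "./infer",
        "--output_dir", output_dir,
        "--num_gpus", "8",
        "--num_samples", "-1",
        "--lora", "./lora/LDRN.safetensors",
        "--single_denoise",
        "--prompt_type", "empty",
        "--norm_type", "ln",
        "--task_name", task,
        f"--{task}_eval_datasets", dataset,
    ]


def build_env(port, gpus="0,1,2,3,4,5,6,7"):
    """Copy the current environment and add the variables used in the commands above."""
    env = dict(os.environ)
    env.update(
        CUDA_VISIBLE_DEVICES=gpus,
        MASTER_PORT=str(port),
        PYTHONUNBUFFERED="1",
    )
    return env


if __name__ == "__main__":
    # Dry run: print one command per normal benchmark without executing anything.
    for dataset in ("nyuv2", "scannet", "ibims", "sintel"):
        cmd = build_eval_command("normal", dataset, f"./infer/eval_{dataset}_normal")
        print(shlex.join(cmd))
```

To launch a run instead of printing it, pass the list to `subprocess.run(cmd, env=build_env(21258), check=True)` from the FE2E/ root.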

4. Reference Logs

To see what a successful run looks like, compare your output against the reference logs in logs/:

logs/verify_scannet_normal_8gpu_20260317_171345.log
logs/verify_eth3d_8gpu_20260317_172004.log

🎓 Citation

If you find our work useful, please cite:

@article{wang2025editor,
  title={From Editor to Dense Geometry Estimator},
  author={Wang, JiYuan and Lin, Chunyu and Sun, Lei and Liu, Rongying and Nie, Lang and Li, Mingxing and Liao, Kang and Chu, Xiangxiang and Zhao, Yao},
  journal={arXiv preprint arXiv:2509.04338},
  year={2025}
}