	# FE2E: From Editor to Dense Geometry Estimator

	[![Page](https://img.shields.io/badge/Project-Website-pink?logo=googlechrome&logoColor=white)](https://amap-ml.github.io/FE2E/)
	[![Paper](https://img.shields.io/badge/arXiv-2509.04338-b31b1b?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2509.04338)
	[![GitHub](https://img.shields.io/github/stars/AMAP-ML/FE2E?style=social)](https://github.com/AMAP-ML/FE2E)
	[![HuggingFace](https://img.shields.io/badge/🤗%20HuggingFace-Model-yellow)](https://huggingface.co/exander/FE2E)
	[![Video](https://img.shields.io/badge/BiliBili-Video-00A1D6)](https://www.bilibili.com/video/BV1zYXdBXE2x)
	[![Video](https://img.shields.io/badge/YouTube-Video-red)](https://youtu.be/fyXwwH_-o5w)

	[Jiyuan Wang](https://wangjiyuan9.github.io/)<sup>1,2</sup>,
	[Chunyu Lin](https://scholar.google.com/citations?hl=zh-CN&user=t8xkhscAAAAJ)<sup>1✉</sup>,
	[Lei Sun](https://scholar.google.com/citations?user=your-id)<sup>2✝</sup>,
	[Rongying Liu](https://scholar.google.com/citations?user=your-id)<sup>1</sup>,
	[Mingxing Li](https://scholar.google.com/citations?user=-pfkprkAAAAJ&hl=zh-CN&oi=ao)<sup>2</sup>,
	[Lang Nie](https://scholar.google.com/citations?hl=zh-CN&user=vo__egkAAAAJ)<sup>3</sup>,
	[Kang Liao](https://kangliao929.github.io/)<sup>4</sup>,
	[Xiangxiang Chu](https://cxxgtxy.github.io/)<sup>2</sup>,
	[Yao Zhao](https://faculty.bjtu.edu.cn/5900/)<sup>1</sup>

	<span class="author-block"><sup>1</sup>Beijing Jiaotong University</span>
	<span class="author-block"><sup>2</sup>Alibaba Group</span>
	<span class="author-block"><sup>3</sup>Chongqing University of Posts and Telecommunications</span>
	<span class="author-block"><sup>4</sup>Nanyang Technological University</span>
	<span class="author-block"><sup>✉</sup>Corresponding author. <sup>✝</sup>Project leader.</span>

	![teaser](assets/demo.png)

	We present FE2E, a DiT-based foundation model for monocular dense geometry prediction. FE2E adapts an advanced image editing model to dense geometry tasks and achieves strong zero-shot performance on both monocular depth and normal estimation.

	![pipeline](assets/pipeline.png)

	## 📢 News
	- [2026-03-17]: Code and checkpoint are now available!
	- [2026-02-21]: FE2E was accepted to CVPR 2026!!! 🎉🎉🎉
	- [2025-09-05]: Paper released on [arXiv](https://arxiv.org/abs/2509.04338).

	---

	## 🛠️ Setup

	This codebase is an inference and evaluation release.

	```bash
	pip install -r requirements.txt
	```

	Recommended local layout:

	```text
	FE2E/
	├── pretrain/
	│   ├── step1x-edit-i1258.safetensors
	│   ├── step1x-edit-v1p1-official.safetensors
	│   └── vae.safetensors
	├── lora/
	│   └── LDRN.safetensors
	├── infer/
	│   ├── eth3d/
	│   │   └── eth3d.tar
	│   └── dsine_eval/
	│       ├── nyuv2/
	│       └── scannet/
	└── logs/
	```
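
Before launching an evaluation, it can help to confirm that the files are where the layout above expects them. The following is a minimal sketch, not part of the FE2E codebase: the helper name `missing_paths` and the `EXPECTED` list are hypothetical, with the relative paths copied from the layout above.

```python
from pathlib import Path

# Relative paths copied from the recommended layout; adjust if yours differs.
EXPECTED = [
    "pretrain/step1x-edit-i1258.safetensors",
    "pretrain/step1x-edit-v1p1-official.safetensors",
    "pretrain/vae.safetensors",
    "lora/LDRN.safetensors",
    "infer",
]


def missing_paths(root="."):
    """Return the entries of EXPECTED that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]
```

Calling `missing_paths("FE2E")` returns an empty list when every expected file and directory is present, and otherwise lists what still needs to be downloaded or created.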

	---

	## 🔥 Training

	```text
	[ ] Training code will be released later.
	```

	---

	## 🕹️ Inference

	### 1. Prepare Model Weights

	1. Download the base weights from the official [Step1X-Edit](https://github.com/stepfun-ai/Step1X-Edit) release.
	2. Download the FE2E LoRA [checkpoint](https://huggingface.co/exander/FE2E/blob/main/LDRN.safetensors).
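
The LoRA checkpoint in step 2 can also be fetched from a script. This is a minimal sketch that assumes the `huggingface_hub` package is installed (this README does not confirm that it is listed in `requirements.txt`). The function name `download_lora` is hypothetical, while the repository ID and filename come from the checkpoint link above. The base Step1X-Edit weights are not covered here and should be obtained by following that project's own instructions.

```python
REPO_ID = "exander/FE2E"       # taken from the checkpoint link above
FILENAME = "LDRN.safetensors"  # taken from the checkpoint link above


def download_lora(local_dir="lora"):
    """Download the FE2E LoRA checkpoint into `local_dir` and return its path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
```

Calling `download_lora()` from the repository root should place the file at `lora/LDRN.safetensors`, which matches the recommended layout.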


	### 2. Prepare Benchmark Datasets

	- Depth benchmarks follow the external evaluation data convention from [Marigold](https://github.com/prs-eth/Marigold).
	- Normal benchmarks follow the external evaluation data convention from [DSINE](https://github.com/baegwangbin/DSINE).


	Supported depth benchmarks:
	- `nyu_v2`, `kitti`, `eth3d`, `diode`, `scannet`

	Supported normal benchmarks:
	- `nyuv2`, `scannet`, `ibims`, `sintel`


	### 3. Run Evaluation

	Normal estimation (replace `[dataset]` with one of the supported normal benchmarks):

	```bash
	CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
	MASTER_PORT=21258 \
	PYTHONUNBUFFERED=1 \
	python -u evaluation.py \
	--model_path ./pretrain \
	--eval_data_root ./infer \
	--output_dir ./infer/eval_verify_scannet_normal_8gpu \
	--num_gpus 8 \
	--num_samples -1 \
	--lora ./lora/LDRN.safetensors \
	--single_denoise \
	--prompt_type empty \
	--norm_type ln \
	--task_name normal \
	--normal_eval_datasets [dataset]
	```

	Depth estimation (replace `[dataset]` with one of the supported depth benchmarks):

	```bash
	CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
	MASTER_PORT=21257 \
	PYTHONUNBUFFERED=1 \
	python -u evaluation.py \
	--model_path ./pretrain \
	--eval_data_root ./infer \
	--output_dir ./infer/eval_verify_eth3d_8gpu \
	--num_gpus 8 \
	--num_samples -1 \
	--lora ./lora/LDRN.safetensors \
	--single_denoise \
	--prompt_type empty \
	--norm_type ln \
	--task_name depth \
	--depth_eval_datasets [dataset]
	```
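
To run several benchmarks in sequence, the two commands above can be generated programmatically. The sketch below is a convenience wrapper written for illustration, not part of the release: `build_command` is a hypothetical helper, and the per-dataset output directory name is an assumption. The flags and dataset names are copied from this README. The function only builds the argument list and leaves launching to the caller.

```python
# Supported benchmarks per task, as listed in this README.
DATASETS = {
    "depth": ["nyu_v2", "kitti", "eth3d", "diode", "scannet"],
    "normal": ["nyuv2", "scannet", "ibims", "sintel"],
}


def build_command(task, dataset, num_gpus=8):
    """Build the evaluation.py argument list for one task/dataset pair."""
    if task not in DATASETS:
        raise ValueError(f"unknown task: {task!r}")
    if dataset not in DATASETS[task]:
        raise ValueError(f"{dataset!r} is not a supported {task} benchmark")
    return [
        "python", "-u", "evaluation.py",
        "--model_path", "./pretrain",
        "--eval_data_root", "./infer",
        # Output directory naming is an assumption made for this sketch.
        "--output_dir", f"./infer/eval_{dataset}_{task}_{num_gpus}gpu",
        "--num_gpus", str(num_gpus),
        "--num_samples", "-1",
        "--lora", "./lora/LDRN.safetensors",
        "--single_denoise",
        "--prompt_type", "empty",
        "--norm_type", "ln",
        "--task_name", task,
        f"--{task}_eval_datasets", dataset,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, with `CUDA_VISIBLE_DEVICES` and `MASTER_PORT` set in the environment as in the shell commands above.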


	### 4. Reference Logs
	To see what a successful run looks like, this repo includes reference run logs in `logs/`:
	- `logs/verify_scannet_normal_8gpu_20260317_171345.log`
	- `logs/verify_eth3d_8gpu_20260317_172004.log`


	---

	## 🎓 Citation

	If you find our work useful, please cite:

	```bibtex
	@article{wang2025editor,
	title={From Editor to Dense Geometry Estimator},
	author={Wang, JiYuan and Lin, Chunyu and Sun, Lei and Liu, Rongying and Nie, Lang and Li, Mingxing and Liao, Kang and Chu, Xiangxiang and Zhao, Yao},
	journal={arXiv preprint arXiv:2509.04338},
	year={2025}
	}
	```