Improve model card (#1)

- Improve model card (965d4ba5c97956b94e33dbfe4f71241e4c6bb0e9)

Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)

README.md +37 -4

@@ -1,26 +1,38 @@
 ---
 license: mit
 tags:
-- depth-estimation
 - video-depth
 - monocular-geometry
 - streaming
 ---
-# DyFN
-Pretrained checkpoint for **Stabilizing Streaming Video Geometry via Dynamic Feature Normalization**.
 - **File:** `DyFN.pt`
 - **Parameters:** ~320M
 - **Base:** MoGe-ViT-L with ConvGRU temporal stabilizer
-- **Code:** [shawLyu/Streaming_DyFN](https://github.com/shawLyu/Streaming_DyFN)
 ## Usage
 ```python
 from moge.model.v1 import MoGeModel
 model = MoGeModel.from_pretrained("shawlyu/DyFN")
 ```
@@ -29,3 +41,24 @@ Or pass a local path:
 ```python
 model = MoGeModel.from_pretrained("./pretrained/DyFN.pt")
 ```

 ---
 license: mit
+pipeline_tag: depth-estimation
 tags:
 - video-depth
 - monocular-geometry
 - streaming
 ---
+# DyFN: Stabilizing Streaming Video Geometry via Dynamic Feature Normalization
+This repository contains the pretrained checkpoint for **DyFN**, a model designed for consistent 3D geometry estimation from streaming RGB input.
+[**Paper**](https://huggingface.co/papers/2605.25308) | [**Project Page**](https://shawlyu.github.io/DyFN) | [**Code**](https://github.com/shawLyu/Streaming_DyFN)
+## Description
+Dynamic Feature Normalization (DyFN) is a lightweight, causal recurrent module that dynamically modulates feature statistics to keep the estimated geometry stable over time. Finetuning only DyFN, which adds about 2% more parameters to a pretrained monocular geometry model, eliminates temporal artifacts such as disjointed layering and positional jitter without compromising single-image accuracy.
 - **File:** `DyFN.pt`
 - **Parameters:** ~320M
 - **Base:** MoGe-ViT-L with ConvGRU temporal stabilizer
 ## Usage
+To use this model, install the package with pip:
+```bash
+pip install git+https://github.com/shawLyu/Streaming_DyFN.git
+```
+Then, load the model with the following snippet:
 ```python
 from moge.model.v1 import MoGeModel
+# Load from Hugging Face Hub
 model = MoGeModel.from_pretrained("shawlyu/DyFN")
 ```
 ```python
 model = MoGeModel.from_pretrained("./pretrained/DyFN.pt")
 ```
+## Citation
+If you find this project useful in your research, please cite:
+```bibtex
+@inproceedings{lyu2026streamingdepth,
+  title={Stabilizing Streaming Video Geometry via Dynamic Feature Normalization},
+  author={Lyu, Xiaoyang and Liu, Muxin and Wu, Xiaoshan and Wang, Ruicheng and Huang, Yi-Hua and Sun, Yang-Tian and Shi, Shaoshuai and Qi, Xiaojuan},
+  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+  year={2026}
+}
+@inproceedings{wang2025moge,
+  title={{MoGe}: Unlocking accurate monocular geometry estimation for open-domain images with optimal training supervision},
+  author={Wang, Ruicheng and Xu, Sicheng and Dai, Cassie and Xiang, Jianfeng and Deng, Yu and Tong, Xin and Yang, Jiaolong},
+  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
+  pages={5261--5271},
+  year={2025}
+}
+```
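
The Usage section loads the checkpoint either from the Hub ID `shawlyu/DyFN` or from a local file such as `./pretrained/DyFN.pt`. As a minimal sketch, the choice between the two could be wrapped in a small helper. The name `resolve_checkpoint` is hypothetical and not part of the `moge` package, and the default values are simply the ones shown in the model card.

```python
from pathlib import Path


def resolve_checkpoint(local_path="./pretrained/DyFN.pt", hub_id="shawlyu/DyFN"):
    """Return the local checkpoint path if the file exists, else the Hub repo ID.

    Hypothetical helper for illustration; not part of the moge package.
    """
    path = Path(local_path)
    return str(path) if path.is_file() else hub_id


if __name__ == "__main__":
    source = resolve_checkpoint()
    print(f"Loading DyFN from: {source}")
    # With the package installed, the result can be passed to the loader
    # shown in the model card:
    #   from moge.model.v1 import MoGeModel
    #   model = MoGeModel.from_pretrained(source)
```

Preferring the local file avoids a network download when the checkpoint is already on disk.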