NicolasCC committed on
Commit d66b1a1 · verified · 1 Parent(s): dd64d2d

Update README.md

Files changed (1)
  1. README.md +22 -5
README.md CHANGED
@@ -1,3 +1,4 @@
+ ---
  license: apache-2.0
  language:
  - en
@@ -18,6 +19,18 @@ arxiv: 2605.23889
 
  [[Paper](https://arxiv.org/abs/2605.23889)] [[Code](https://github.com/3dagentworld/horizonstream)] [[Project Page](https://3dagentworld.github.io/horizonstream/)]
 
+ ## Overview
+
+ Online 3D reconstruction requires estimating camera pose and scene geometry under strict causal and bounded-memory constraints. Existing methods suffer from drift, jitter, or collapse on long sequences — a symptom of a fundamental mismatch: streaming geometry is temporally heterogeneous (evidence spans short-lived correspondences to persistent global scale), yet most architectures impose uniform influence patterns such as hard sliding-window cutoffs or ungated recurrence that causes cache saturation and attention sinks.
+
+ **HorizonStream** resolves this by formalizing geometric propagation as an *evidence influence kernel* and explicitly factorizing it:
+
+ - **Geometric Linear Attention** — learns per-channel decay rates for bounded, multi-timescale propagation of geometric evidence across the full history.
+ - **Geometric Local Attention with Spatiotemporal RoPE** — performs reliable short-range 3D matching while suppressing attention sinks.
+ - **Metric Readout Tokens** — recover stable scale and rigid pose directly from the persistent geometric state.
+
+ Trained on only 48-frame clips, HorizonStream generalizes stably to sequences exceeding **10,000 frames** with **constant memory** and **linear time**, achieving state-of-the-art streaming 3D reconstruction on KITTI, Waymo, and VBR benchmarks. It outputs camera poses, depth maps, videos, and point clouds from raw image sequences or videos.
+
  ## Quick Start
 
  ```bash
@@ -32,10 +45,14 @@ python infer.py \
  ## Citation
 
  ```bibtex
- @article{cheng2026horizonstream,
- title={HorizonStream: Long-Horizon Attention for Streaming 3D Reconstruction},
- author={Chong Cheng and Peilin Tao and Nanjie Yao and Guanzhi Ding and Xianda Chen and Yuansen Du and Xiaoyang Guo and Wei Yin and Weiqiang Ren and Qian Zhang and Zhengqing Chen and Hao Wang},
- journal={arXiv preprint arXiv:2605.23889},
- year={2026}
+ @misc{cheng2026horizonstreamlonghorizonattentionstreaming,
+ title={HorizonStream: Long-Horizon Attention for Streaming 3D Reconstruction},
+ author={Chong Cheng and Peilin Tao and Nanjie Yao and Guanzhi Ding and Xianda Chen and Yuansen Du and Xiaoyang Guo and Wei Yin and Weiqiang Ren and Qian Zhang and Zhengqing Chen and Hao Wang},
+ year={2026},
+ eprint={2605.23889},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2605.23889},
  }
  ```
+
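
The Overview added in this commit describes Geometric Linear Attention as learning per-channel decay rates while running in constant memory and linear time. As a rough illustration only, and not the HorizonStream implementation (which this diff does not show), a generic decayed linear-attention recurrence might look like the sketch below. The function name `decayed_linear_attention`, the argument layout, and the fixed `decay` list standing in for learned rates are all assumptions made for this example.

```python
def decayed_linear_attention(queries, keys, values, decay):
    """Causal linear attention with a per-channel decay, run as a recurrence.

    queries, keys: sequences of T vectors, each of length d_k
    values:        sequence of T vectors, each of length d_v
    decay:         d_k rates in (0, 1); a hypothetical stand-in for learned rates
    """
    d_k, d_v = len(decay), len(values[0])
    # Fixed-size state: its shape depends on d_k and d_v, never on T.
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v in zip(queries, keys, values):
        for i in range(d_k):
            row = state[i]
            for j in range(d_v):
                # Old evidence in channel i fades by decay[i];
                # the new outer-product term k[i] * v[j] is added.
                row[j] = decay[i] * row[j] + k[i] * v[j]
        # Readout for this step: o[j] = sum_i q[i] * state[i][j]
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs
```

Each step touches the state once, so the cost grows linearly with the number of frames while memory stays fixed. Giving different channels different rates (for example `decay = [0.5, 0.9]`) lets one channel forget quickly and another retain evidence for longer, which is one simple way to picture the multi-timescale propagation the README describes.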