metadata
license: apache-2.0
language:
  - en
tags:
  - 3d-reconstruction
  - depth-estimation
  - camera-pose-estimation
  - streaming
  - video-understanding
  - computer-vision
  - point-cloud
arxiv: 2605.23889

HorizonStream

HorizonStream: Long-Horizon Attention for Streaming 3D Reconstruction

[Paper] [Code] [Project Page]

Overview

Online 3D reconstruction requires estimating camera pose and scene geometry under strict causal and bounded-memory constraints. Existing methods suffer from drift, jitter, or collapse on long sequences. These failures are symptoms of a fundamental mismatch: streaming geometry is temporally heterogeneous, with evidence ranging from short-lived correspondences to persistent global scale, yet most architectures impose a uniform influence pattern. Examples include hard sliding-window cutoffs and ungated recurrence, the latter causing cache saturation and attention sinks.

HorizonStream resolves this by formalizing geometric propagation as an evidence influence kernel and explicitly factorizing it:

  • Geometric Linear Attention — learns per-channel decay rates for bounded, multi-timescale propagation of geometric evidence across the full history.
  • Geometric Local Attention with Spatiotemporal RoPE — performs reliable short-range 3D matching while suppressing attention sinks.
  • Metric Readout Tokens — recover stable scale and rigid pose directly from the persistent geometric state.
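To make the first component more concrete, the following is a minimal NumPy sketch of causal linear attention with a per-channel exponential decay. It is not the HorizonStream implementation: the function name, the shapes, the fixed (rather than learned) decay values, and the choice to apply the decay per key channel are all assumptions made for illustration. What it does show is the general idea that each channel can forget at its own rate while the recurrent state keeps a fixed size regardless of sequence length.

```python
import numpy as np


def decayed_linear_attention(q, k, v, decay):
    """Causal linear attention with a per-channel exponential decay.

    q, k: (T, d_k) arrays; v: (T, d_v) array; decay: (d_k,) values in (0, 1).
    Returns a (T, d_v) array. The recurrent state always has shape
    (d_k, d_v), so memory does not grow with the number of frames T.
    """
    T, d_k = k.shape
    d_v = v.shape[1]
    state = np.zeros((d_k, d_v))
    out = np.empty((T, d_v))
    for t in range(T):
        # Decay each key channel at its own rate, then add the new evidence.
        state = decay[:, None] * state + np.outer(k[t], v[t])
        # Read the accumulated evidence out with the current query.
        out[t] = q[t] @ state
    return out


rng = np.random.default_rng(0)
T, d_k, d_v = 200, 4, 3
q = rng.normal(size=(T, d_k))
k = rng.normal(size=(T, d_k))
v = rng.normal(size=(T, d_v))

# Hypothetical decay rates: small values forget within a few frames,
# values close to 1 keep evidence for hundreds of frames.
decay = np.array([0.5, 0.9, 0.99, 0.999])

out = decayed_linear_attention(q, k, v, decay)
print(out.shape)
```

In this sketch, the weight that frame `s` contributes to frame `t` through channel `c` is `decay[c] ** (t - s)`, so a mix of decay rates gives a multi-timescale influence pattern instead of a single hard cutoff.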

Trained on only 48-frame clips, HorizonStream generalizes stably to sequences exceeding 10,000 frames with constant memory and linear time, achieving state-of-the-art streaming 3D reconstruction on KITTI, Waymo, and VBR benchmarks. It outputs camera poses, depth maps, videos, and point clouds from raw image sequences or videos.
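As a rough illustration of how per-frame depth maps and camera poses relate to a point cloud, the sketch below applies standard pinhole back-projection. It is a generic technique, not code from this repository. The function name, the intrinsics matrix `K`, the z-depth convention, and the camera-to-world pose convention are assumptions; check the actual output format before reusing it.

```python
import numpy as np


def depth_to_world_points(depth, K, cam_to_world):
    """Back-project a depth map of shape (H, W) into world points (H*W, 3).

    K is a 3x3 pinhole intrinsics matrix and cam_to_world a 4x4 rigid
    transform. Depth is assumed to be z-depth along the optical axis.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    # Camera-frame rays with z = 1, one per pixel.
    rays = pixels @ np.linalg.inv(K).T
    # Scale each ray by its depth to get camera-frame points.
    pts_cam = rays * depth.reshape(-1, 1)
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    # Rotate and translate into the world frame.
    return pts_cam @ R.T + t


# Toy example with made-up values: a 3x4 depth map at 2 m everywhere,
# and a camera shifted 1 m along the world x-axis.
K = np.array([[100.0, 0.0, 2.0],
              [0.0, 100.0, 1.0],
              [0.0, 0.0, 1.0]])
depth = np.full((3, 4), 2.0)
pose = np.eye(4)
pose[0, 3] = 1.0

points = depth_to_world_points(depth, K, pose)
print(points.shape)
```

Concatenating the points from every frame, each transformed with its own pose, gives a single cloud in a common world frame.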

Quick Start

python infer.py \
  --config configs/horizonstream_infer.yaml \
  --video-path /path/to/input.mp4 \
  --hf-repo NicolasCC/HorizonStream \
  --hf-file HorizonStream.pt \
  --output-root outputs_horizonstream/input_video
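To process several videos, the same command can be driven from Python. The sketch below uses only the flags shown in the command above. The helper names `build_infer_command` and `run_batch` are hypothetical, and the sketch assumes it is run from the repository root so that `infer.py` and the config path resolve.

```python
import subprocess
import sys
from pathlib import Path


def build_infer_command(video_path, output_root,
                        config="configs/horizonstream_infer.yaml",
                        hf_repo="NicolasCC/HorizonStream",
                        hf_file="HorizonStream.pt"):
    """Assemble the Quick Start command as an argument list."""
    return [
        sys.executable, "infer.py",
        "--config", config,
        "--video-path", str(video_path),
        "--hf-repo", hf_repo,
        "--hf-file", hf_file,
        "--output-root", str(output_root),
    ]


def run_batch(videos, output_base="outputs_horizonstream"):
    """Run inference on each video, writing to one subfolder per video."""
    for video in map(Path, videos):
        cmd = build_infer_command(video, Path(output_base) / video.stem)
        # check=True raises CalledProcessError if infer.py exits with an error.
        subprocess.run(cmd, check=True)


# Build (but do not run) the command for a single video.
cmd = build_infer_command("/path/to/input.mp4", "outputs_horizonstream/input_video")
print(" ".join(cmd))
```

Calling `run_batch(["a.mp4", "b.mp4"])` would then launch one inference run per file, stopping at the first failure.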

Citation

@misc{cheng2026horizonstreamlonghorizonattentionstreaming,
      title={HorizonStream: Long-Horizon Attention for Streaming 3D Reconstruction}, 
      author={Chong Cheng and Peilin Tao and Nanjie Yao and Guanzhi Ding and Xianda Chen and Yuansen Du and Xiaoyang Guo and Wei Yin and Weiqiang Ren and Qian Zhang and Zhengqing Chen and Hao Wang},
      year={2026},
      eprint={2605.23889},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.23889}, 
}