Upload folder using huggingface_hub

Files changed (4)

.gitattributes +1 -0
README.md +143 -3
assets/teaser.png +3 -0
lingbot-map.pt +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+assets/teaser.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,143 @@
----
-license: apache-2.0
----

+<h1 align="center">LingBot-Map: Geometric Context Transformer for Streaming 3D Reconstruction</h1>
+<p align="center">
+  <a href="lingbot-map_paper.pdf"><img src="https://img.shields.io/static/v1?label=Paper&message=PDF&color=red&logo=arxiv"></a>
+  <a href="https://technology.robbyant.com/lingbot-map"><img src="https://img.shields.io/badge/Project-Website-blue"></a>
+  <a href=""><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Model&message=HuggingFace&color=orange"></a>
+  <a href="LICENSE.txt"><img src="https://img.shields.io/badge/License-Apache--2.0-green"></a>
+</p>
+<p align="center">
+  <img src="assets/teaser.png" width="100%">
+</p>
+<p align="center">
+  <video src="https://gw.alipayobjects.com/v/huamei_vaouhm/afts/video/q0sdTr9Mm6IAAAAAmyAAAAgADglFAQJr" width="100%" autoplay loop muted playsinline></video>
+</p>
+---
+# Quick Start
+## Installation
+**1. Create conda environment**
+```bash
+conda create -n lingbot-map python=3.10 -y
+conda activate lingbot-map
+```
+**2. Install PyTorch (CUDA 12.8)**
+```bash
+pip install torch==2.9.1 torchvision==0.24.1 --index-url https://download.pytorch.org/whl/cu128
+```
+> For other CUDA versions, see [PyTorch Get Started](https://pytorch.org/get-started/locally/).
+**3. Install lingbot-map**
+```bash
+pip install -e .
+```
+**4. Install FlashInfer (recommended)**
+FlashInfer provides paged KV cache attention for efficient streaming inference:
+```bash
+# CUDA 12.8 + PyTorch 2.9
+pip install flashinfer-python -i https://flashinfer.ai/whl/cu128/torch2.9/
+```
+> For other CUDA/PyTorch combinations, see [FlashInfer installation](https://docs.flashinfer.ai/installation.html).
+> If FlashInfer is not installed, the model falls back to SDPA (PyTorch native attention) via `--use_sdpa`.
+**5. Visualization dependencies (optional)**
+```bash
+pip install -e ".[vis]"
+```
+# Demo
+## Streaming Inference from Images
+```bash
+python demo.py --model_path /path/to/checkpoint.pt \
+    --image_folder /path/to/images/
+```
+## Streaming Inference from Video
+```bash
+python demo.py --model_path /path/to/checkpoint.pt \
+    --video_path video.mp4 --fps 10
+```
+## Streaming with Keyframe Interval
+Use `--keyframe_interval` to reduce KV cache memory by keeping only every N-th frame as a keyframe. Non-keyframe frames still produce predictions but are not stored in the cache. This is useful for long sequences that exceed 320 frames.
+```bash
+python demo.py --model_path /path/to/checkpoint.pt \
+    --image_folder /path/to/images/ --keyframe_interval 6
+```
+## Windowed Inference (for long sequences, >3000 frames)
+```bash
+python demo.py --model_path /path/to/checkpoint.pt \
+    --video_path video.mp4 --fps 10 \
+    --mode windowed --window_size 64
+```
+## With Sky Masking
+```bash
+python demo.py --model_path /path/to/checkpoint.pt \
+    --image_folder /path/to/images/ --mask_sky
+```
+## Without FlashInfer (SDPA fallback)
+```bash
+python demo.py --model_path /path/to/checkpoint.pt \
+    --image_folder /path/to/images/ --use_sdpa
+```
+# Model Download
+<!-- TODO: fill in model checkpoints -->
+| Model Name | Huggingface Repository | Description |
+| :--- | :--- | :--- |
+| lingbot-map | | Base model checkpoint |
+# License
+This project is released under the Apache License 2.0. See the [LICENSE](LICENSE.txt) file for details.
+# Citation
+```bibtex
+@article{lingbot-map2026,
+  title={},
+  author={},
+  journal={arXiv preprint arXiv:},
+  year={2026}
+}
+```
+# Acknowledgments
+This work builds upon several excellent open-source projects:
+- [VGGT](https://github.com/facebookresearch/vggt)
+- [DINOv2](https://github.com/facebookresearch/dinov2)
+- [FlashInfer](https://github.com/flashinfer-ai/flashinfer)
+---
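
The `--keyframe_interval` option in the README above keeps only every N-th frame in the KV cache. The following is a minimal sketch of that bookkeeping, not the project's implementation. It assumes frame `i` counts as a keyframe when `i % N == 0`; the README does not state the phase, and the function name `cached_frame_indices` is hypothetical.

```python
def cached_frame_indices(num_frames: int, keyframe_interval: int) -> list[int]:
    """Return the frame indices that would remain in the KV cache.

    Assumption (not confirmed by the README): frame i is treated as a
    keyframe when i % keyframe_interval == 0, so frame 0 is always cached.
    """
    if keyframe_interval < 1:
        raise ValueError("keyframe_interval must be >= 1")
    return list(range(0, num_frames, keyframe_interval))


# Example: a 320-frame sequence with the interval used in the README command.
cached = cached_frame_indices(num_frames=320, keyframe_interval=6)
print(len(cached))  # 54 cached frames instead of 320
```

Under this assumption the cache holds roughly `num_frames / N` frames, which is why a larger interval helps with long sequences. All frames would still be processed for predictions.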

assets/teaser.png ADDED

Git LFS Details

SHA256: d34377bdb2f0747442f3113692914e669e97cb1d474578711cc30d08c5618bcc
Pointer size: 132 Bytes
Size of remote file: 5.11 MB

lingbot-map.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:986579f63db7bde3cb0f0ecc0a8fd49f5e4b6141a178ac33598d7fbe3e901cd0
+size 4632326476
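
The three added lines are a Git LFS pointer: the repository stores this small text file, while the checkpoint itself lives in LFS storage. As a hedged sketch, a downloaded `lingbot-map.pt` could be checked against the pointer like this. The helper names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical, and only the Python standard library is used.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:986579f63db7bde3cb0f0ecc0a8fd49f5e4b6141a178ac33598d7fbe3e901cd0
size 4632326476
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer: dict) -> bool:
    """Compare a local file's size and hash with the pointer's values."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(f"{pointer['size'] / 2**30:.2f} GiB")  # 4.31 GiB
```

Calling `file_matches_pointer("lingbot-map.pt", pointer)` after a download would return `True` only if both the byte count and the SHA-256 digest agree with the pointer.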