kliyer committed on
Commit 22c2224 · verified · 1 Parent(s): c6ab401

Update README.md

  1. README.md +87 -3
README.md CHANGED
@@ -1,3 +1,87 @@
- ---
- license: cc-by-nc-sa-4.0
- ---
+ ---
+ language:
+ - en
+ license: cc-by-nc-sa-4.0
+ pipeline_tag: other
+ tags:
+ - motion-generation
+ - trajectory-prediction
+ - robotics
+ - computer-vision
+ - pytorch
+ - torch-hub
+ ---
+
+ # ZipMo (Learning Long-term Motion Embeddings for Efficient Kinematics Generation)
+
+ [![Project Page](https://img.shields.io/badge/Project-Page-blue)](https://compvis.github.io/long-term-motion)
+ [![Paper](https://img.shields.io/badge/arXiv-coming_soon-b31b1b)](https://arxiv.org/)
+ [![GitHub](https://img.shields.io/badge/GitHub-Code-black)](https://github.com/CompVis/long-term-motion)
+ [![Venue](https://img.shields.io/badge/CVPR-2026-green)](https://compvis.github.io/long-term-motion)
+
+ ZipMo is a motion-space model for efficient long-horizon kinematics generation. It learns compact long-term motion embeddings from large-scale tracker-derived trajectories and generates plausible future motion directly in this learned motion space. The model supports spatial-poke conditioning for open-domain videos and task/text-embedding conditioning for LIBERO robotics evaluation.
+
+ ## Paper and Abstract
+
+ ZipMo was introduced in the CVPR 2026 paper **Learning Long-term Motion Embeddings for Efficient Kinematics Generation**.
+
+ Understanding and predicting motion is a fundamental component of visual intelligence. Although video models can synthesize scene dynamics, exploring many possible futures through full video generation is expensive. ZipMo instead operates directly on long-term motion embeddings learned from tracker trajectories, enabling efficient generation of long, realistic motions while preserving dense reconstruction at arbitrary spatial query points.
+
+ ![ZipMo teaser figure](https://compvis.github.io/long-term-motion/static/images/social_preview.png)
+ *ZipMo generates long-horizon motion in a compact learned motion space, supporting spatial-poke conditioning for open-domain videos and task-conditioned action prediction on LIBERO.*
+
+ ## Usage
+
+ The simplest way to load ZipMo programmatically is via `torch.hub`:
+
+ ```python
+ import torch
+
+ repo = "CompVis/long-term-motion"
+
+ # Open-domain motion prediction
+ planner_sparse = torch.hub.load(repo, "zipmo_planner_sparse")
+ planner_dense = torch.hub.load(repo, "zipmo_planner_dense")
+
+ # Motion autoencoder
+ vae = torch.hub.load(repo, "zipmo_vae")
+ ```
+
+ LIBERO planning and policy components can be loaded in the same way:
+
+ ```python
+ import torch
+
+ repo = "CompVis/long-term-motion"
+
+ # LIBERO planners
+ libero_atm_planner = torch.hub.load(repo, "zipmo_planner_libero", "atm")
+ libero_tramoe_planner = torch.hub.load(repo, "zipmo_planner_libero", "tramoe")
+
+ # LIBERO policy heads
+ policy_head_atm = torch.hub.load(repo, "zipmo_policy_head", "atm")
+ policy_head_tramoe_goal = torch.hub.load(repo, "zipmo_policy_head", "tramoe", "goal")
+ ```
+
+ Available Torch Hub entries:
+
+ - `zipmo_planner_sparse`: sparse-poke planner for open-domain motion prediction.
+ - `zipmo_planner_dense`: dense-conditioning planner for open-domain motion prediction.
+ - `zipmo_vae`: long-term motion autoencoder.
+ - `zipmo_planner_libero`: LIBERO planner with mode `atm` or `tramoe`.
+ - `zipmo_policy_head`: LIBERO policy head with mode `atm` or `tramoe`. For `tramoe`, also pass one of `10`, `goal`, `object`, or `spatial` as an additional positional argument after the mode.
+
+ For the interactive demo, standard track prediction evaluation, LIBERO rollout evaluation, and training instructions, see the [GitHub repository](https://github.com/CompVis/long-term-motion).
+
+ ## Citation
+
+ If you find our model or code useful, please cite our paper:
+
+ ```bibtex
+ @inproceedings{stracke2026motionembeddings,
+ title = {Learning Long-term Motion Embeddings for Efficient Kinematics Generation},
+ author = {Stracke, Nick and Bauer, Kolja and Baumann, Stefan Andreas and Bautista, Miguel Angel and Susskind, Josh and Ommer, Bj{\"o}rn},
+ booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+ year = {2026}
+ }
+ ```
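
The Torch Hub entries listed in the README's Usage section take different positional arguments, so a mistyped mode only surfaces after the repository has been fetched. The sketch below is one possible way to check the arguments first. It is not part of the ZipMo repository: `hub_args` and the constant names are hypothetical, and the rules it enforces are assumptions read off the README examples (for instance, that only the `tramoe` policy head takes a further argument).

```python
# Hypothetical helper, not part of the ZipMo repository. It mirrors the entry
# names and modes listed in the README so that typos fail before any download.

REPO = "CompVis/long-term-motion"

# Entries the README loads without extra positional arguments.
PLAIN_ENTRIES = {"zipmo_planner_sparse", "zipmo_planner_dense", "zipmo_vae"}
# Entries the README loads with a mode argument.
MODE_ENTRIES = {"zipmo_planner_libero", "zipmo_policy_head"}
MODES = {"atm", "tramoe"}
# Extra argument the README lists for the "tramoe" policy head.
TRAMOE_VARIANTS = {"10", "goal", "object", "spatial"}


def hub_args(entry, *extra):
    """Return the positional arguments for torch.hub.load, or raise ValueError."""
    if entry in PLAIN_ENTRIES:
        if extra:
            raise ValueError(f"{entry} takes no extra arguments")
        return (REPO, entry)
    if entry not in MODE_ENTRIES:
        raise ValueError(f"unknown entry: {entry}")
    if not extra or extra[0] not in MODES:
        raise ValueError(f"{entry} needs a mode: one of {sorted(MODES)}")
    mode, rest = extra[0], extra[1:]
    if entry == "zipmo_policy_head" and mode == "tramoe":
        # Assumption: exactly one variant argument follows the mode.
        if len(rest) != 1 or rest[0] not in TRAMOE_VARIANTS:
            raise ValueError(
                f"tramoe policy head needs one of {sorted(TRAMOE_VARIANTS)}"
            )
    elif rest:
        raise ValueError(f"unexpected extra arguments: {rest}")
    return (REPO, entry, mode, *rest)
```

With PyTorch installed, the result can be unpacked straight into the loader, for example `torch.hub.load(*hub_args("zipmo_policy_head", "tramoe", "goal"))`.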