---
language:
- en
license: cc-by-nc-sa-4.0
pipeline_tag: other
---

# MYRIAD (Envisioning the Future, One Step at a Time)
[Project Page](https://compvis.github.io/myriad)
[Paper (arXiv)](https://arxiv.org/abs/2604.09527)
[Hugging Face Papers](https://huggingface.co/papers/2604.09527)
[Code (GitHub)](https://github.com/CompVis/flow-poke-transformer)
[Dataset: CompVis/owm-95](https://huggingface.co/datasets/CompVis/owm-95)
[Dataset: CompVis/myriad-physics](https://huggingface.co/datasets/CompVis/myriad-physics)

MYRIAD (Motion hYpothesis Reasoning via Iterative Autoregressive Diffusion) is an autoregressive diffusion model that predicts open-set future scene dynamics as step-wise inference over sparse point trajectories. Starting from a single image, it can efficiently explore thousands of plausible future outcomes while maintaining physical plausibility.
## Paper and Abstract

The MYRIAD model was presented in the paper [Envisioning the Future, One Step at a Time](https://arxiv.org/abs/2604.09527).

From a single image, MYRIAD predicts distributions over sparse point trajectories autoregressively. This allows the model to predict consistent futures in open-set environments and plan actions by exploring a large number of counterfactual interactions.
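To make the idea of step-wise prediction over sparse point trajectories concrete, here is a minimal toy sketch. It is not MYRIAD's implementation: `sample_step` is a hypothetical stand-in (constant-velocity prior plus Gaussian noise) for the learned per-step distribution, and unlike the real model it ignores the image and the other points.

```python
import random


def sample_step(history, rng):
    """Toy stand-in (hypothetical) for a learned per-step motion distribution.

    Samples a 2D displacement for one point, centred on that point's
    previous displacement, with Gaussian noise added.
    """
    if len(history) >= 2:
        (x0, y0), (x1, y1) = history[-2], history[-1]
        mean_dx, mean_dy = x1 - x0, y1 - y0
    else:
        mean_dx, mean_dy = 0.0, 0.0
    return rng.gauss(mean_dx, 0.5), rng.gauss(mean_dy, 0.5)


def rollout(start_points, num_steps, seed=0):
    """Unroll one possible future by sampling each point's next position step by step."""
    rng = random.Random(seed)
    trajectories = [[p] for p in start_points]
    for _ in range(num_steps):
        for traj in trajectories:
            dx, dy = sample_step(traj, rng)
            x, y = traj[-1]
            traj.append((x + dx, y + dy))
    return trajectories


# Each seed yields a different sampled future for the same starting points.
start = [(10.0, 20.0), (30.0, 5.0)]
futures = [rollout(start, num_steps=8, seed=s) for s in range(100)]
```

Because each step is sampled from a distribution conditioned on what came before, repeating the rollout with different random seeds produces a set of distinct futures rather than a single averaged prediction.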

*From a single image, our model envisions diverse, physically consistent futures by predicting sparse point trajectories step by step.*

*Its efficiency enables exploring thousands of counterfactual rollouts directly in motion space, illustrated here for billiards planning, where candidate shots are evaluated by simulating many possible outcomes.*
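The planning idea in the caption, scoring each candidate action by simulating many outcomes, can be sketched as a simple Monte Carlo loop. Everything here is a hypothetical toy: `simulate_shot` is an invented noisy straight-line model standing in for rollouts of a learned motion model, and the target, radius, and noise level are arbitrary.

```python
import math
import random


def simulate_shot(angle, rng, speed=1.0, noise=0.05):
    """Toy stochastic outcome (hypothetical): the ball leaves the origin at a
    noisy version of the requested angle and travels a fixed distance."""
    noisy_angle = angle + rng.gauss(0.0, noise)
    return speed * math.cos(noisy_angle), speed * math.sin(noisy_angle)


def success_rate(angle, target, radius, num_rollouts, rng):
    """Fraction of simulated outcomes that end within `radius` of the target."""
    hits = 0
    for _ in range(num_rollouts):
        x, y = simulate_shot(angle, rng)
        if math.hypot(x - target[0], y - target[1]) <= radius:
            hits += 1
    return hits / num_rollouts


def plan(candidate_angles, target, radius=0.1, num_rollouts=500, seed=0):
    """Score every candidate by its simulated success rate and return the best one."""
    rng = random.Random(seed)
    scores = {
        angle: success_rate(angle, target, radius, num_rollouts, rng)
        for angle in candidate_angles
    }
    best = max(scores, key=scores.get)
    return best, scores


# Candidate shots every 15 degrees; the target lies in the 45-degree direction.
candidates = [math.radians(d) for d in range(0, 91, 15)]
target = (math.cos(math.radians(45)), math.sin(math.radians(45)))
best_angle, scores = plan(candidates, target)
```

The cost of this kind of search grows with the number of candidates times the number of rollouts per candidate, which is why cheap rollouts matter for planning.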
## Usage

The simplest way to use MYRIAD programmatically is via `torch.hub`:

```python
import torch

# Load the open-set model
myriad_openset = torch.hub.load("CompVis/myriad", "myriad_openset")

# Load the billiard-specific model
myriad_billiard = torch.hub.load("CompVis/myriad", "myriad_billiard")
```
If you wish to integrate MYRIAD into your own codebase, you can copy `model.py` and `dinov3.py` from the [GitHub repository](https://github.com/CompVis/flow-poke-transformer).
The `MyriadStepByStep` class contains a `predict_simulate` method for unrolling trajectories and a low-level `forward` method to predict distributions for previously observed trajectories.
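The argument lists of `predict_simulate` and `forward` are not documented here, so one way to discover them is to inspect the loaded object. The helper below is a generic sketch using only Python's standard `inspect` module; `describe_methods` is a hypothetical name, not part of the MYRIAD code.

```python
import inspect


def describe_methods(obj, names):
    """Return {name: signature plus first docstring line} for the named methods."""
    summary = {}
    for name in names:
        method = getattr(obj, name, None)
        if not callable(method):
            summary[name] = "<not found>"
            continue
        # First line of the docstring, or an empty string if there is none.
        doc = (inspect.getdoc(method) or "").split("\n")[0]
        summary[name] = f"{name}{inspect.signature(method)}  # {doc}"
    return summary
```

With a model loaded as above, `describe_methods(myriad_openset, ["predict_simulate", "forward"])` should list the expected arguments, assuming the object returned by `torch.hub.load` is a `MyriadStepByStep` instance.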
## Citation

If you find our model or code useful, please cite our paper:

```bibtex
@inproceedings{baumann2026envisioning,
  title={Envisioning the Future, One Step at a Time},
  author={Baumann, Stefan Andreas and Wiese, Jannik and Martorella, Tommaso and Kalayeh, Mahdi M. and Ommer, Bjorn},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
```