
DriveLaW: Unifying Planning and Video Generation in a Latent Driving World

CVPR 2026

Tianze Xia1,2*, Yongkang Li1,2*, Lijun Zhou2*, Jingfeng Yao1, Kaixin Xiong2, Haiyang Sun2†, Bing Wang2,
Kun Ma2, Guang Chen2, Hangjun Ye2, Wenyu Liu1, Xinggang Wang1✉

1 Huazhong University of Science and Technology 2 Xiaomi EV

(*) Equal contribution. (†) Project leader. (✉) Corresponding author.

Paper PDF Project Page

News

[2026/2/21] Our paper has been accepted at CVPR 2026. 🎉

[2025/12/30] ArXiv paper release. Models/Code are coming soon. Please stay tuned! ☕️

Abstract

World models have become crucial for autonomous driving, as they learn how scenarios evolve over time to address the long-tail challenges of the real world. However, current approaches relegate world models to limited roles: they operate within ostensibly unified architectures that still keep world prediction and motion planning as decoupled processes. To bridge this gap, we propose DriveLaW, a novel paradigm that unifies video generation and motion planning. By directly injecting the latent representation from its video generator into the planner, DriveLaW ensures inherent consistency between high-fidelity future generation and reliable trajectory planning. Specifically, DriveLaW consists of two core components: DriveLaW-Video, our powerful world model that generates high-fidelity forecasts with expressive latent representations, and DriveLaW-Act, a diffusion planner that generates consistent and reliable trajectories from the latents of DriveLaW-Video, with both components optimized by a three-stage progressive training strategy. The power of our unified paradigm is demonstrated by new state-of-the-art results across both tasks. DriveLaW not only advances video prediction significantly, surpassing the best-performing prior work by 33.3% in FID and 1.8% in FVD, but also achieves a new record on the NAVSIM planning benchmark.
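As an intuition aid for the latent-injection idea described above (a shared latent from the video world model conditioning a diffusion-style trajectory planner), here is a minimal, purely illustrative NumPy sketch. All function names, shapes, and operations are hypothetical stand-ins and are not the DriveLaW implementation; the real models and code have not yet been released.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for DriveLaW-Video: compress a frame sequence
# into a single latent vector. Shapes and layers are illustrative only.
def encode_video(frames: np.ndarray, w_enc: np.ndarray) -> np.ndarray:
    pooled = frames.mean(axis=0)       # (D,) mean-pool over time
    return np.tanh(w_enc @ pooled)     # (L,) latent representation

# Hypothetical stand-in for DriveLaW-Act: one refinement step of a
# denoiser conditioned on the shared video latent.
def denoise_step(traj: np.ndarray, latent: np.ndarray,
                 w_traj: np.ndarray, w_lat: np.ndarray) -> np.ndarray:
    correction = w_traj @ traj + w_lat @ latent
    return traj - 0.1 * correction

def plan(frames, w_enc, w_traj, w_lat, horizon=8, steps=20):
    latent = encode_video(frames, w_enc)
    traj = rng.standard_normal(horizon * 2)  # noisy (x, y) waypoints, flattened
    for _ in range(steps):                   # iterative diffusion-style refinement
        traj = denoise_step(traj, latent, w_traj, w_lat)
    return traj.reshape(horizon, 2)

# Toy dimensions: 4 frames of size 32, latent size 16, 8-waypoint horizon.
D, L, H = 32, 16, 8
frames = rng.standard_normal((4, D))
w_enc = rng.standard_normal((L, D)) * 0.1
w_traj = rng.standard_normal((H * 2, H * 2)) * 0.1
w_lat = rng.standard_normal((H * 2, L)) * 0.1
waypoints = plan(frames, w_enc, w_traj, w_lat)
print(waypoints.shape)  # (8, 2)
```

The point of the sketch is only the data flow: the planner never re-encodes the scene itself; it consumes the same latent the video generator produces, which is what ties prediction and planning together in the paradigm described above.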

Contact

If you have any questions, please contact Tianze Xia via email (xiatianze@hust.edu.cn).

Acknowledgments

DriveLaW is inspired by the following outstanding contributions to the open-source community: NAVSIM, LTX-Video, ReCogDrive, Diffusers, Genie Envisioner, Epona.

Citation

If you find DriveLaW useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@article{xia2025drivelaw,
  title={DriveLaW: Unifying Planning and Video Generation in a Latent Driving World},
  author={Xia, Tianze and Li, Yongkang and Zhou, Lijun and Yao, Jingfeng and Xiong, Kaixin and Sun, Haiyang and Wang, Bing and Ma, Kun and Ye, Hangjun and Liu, Wenyu and others},
  journal={arXiv preprint arXiv:2512.23421},
  year={2025}
}