---
license: apache-2.0
language:
- en
datasets:
- nvidia/PhysicalAI-Robotics-GR00T-Teleop-Sim
---

# DIAL Checkpoints

Project Page | Paper | Code

Model weights for **DIAL** (**D**ecoupling **I**ntent and **A**ction via **L**atent World Modeling), an end-to-end Vision-Language-Action (VLA) framework built on [NVIDIA Isaac GR00T N1.5](https://github.com/NVIDIA/Isaac-GR00T/tree/n1.5-release) with a [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) backbone.

## Available Checkpoints

| Checkpoint | Training Data | Steps | Description |
|---|---|---|---|
| `DIAL-3B-fewshot` | EgoDex human data + 10% GR1 simulation data | 20K per stage (3-stage) | Co-trained with heterogeneous human demonstrations |
| `DIAL-3B-fulldata` | All GR1 simulation data (~24,000 demos) | 40K per stage (2-stage) | Trained on full teleoperation trajectories in simulation |

For installation, training, and evaluation instructions, please refer to the [GitHub repository](https://github.com/xpeng-robotics/DIAL).
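
As a rough illustration, one of the checkpoints listed above could be fetched with `huggingface_hub.snapshot_download`. This is a minimal sketch and not the project's documented procedure: the repository id is a placeholder you must supply, the helper names `checkpoint_patterns` and `download_checkpoint` are hypothetical, and the layout (one subfolder per checkpoint, named after it) is an assumption that may not match the actual repository. Check the file listing before relying on it.

```python
from typing import List, Optional

# Checkpoint names as they appear in the table above.
CHECKPOINTS = ("DIAL-3B-fewshot", "DIAL-3B-fulldata")


def checkpoint_patterns(name: str) -> List[str]:
    """Build glob patterns selecting a single checkpoint.

    ASSUMPTION: each checkpoint lives in a subfolder named after it.
    """
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {name!r}; expected one of {CHECKPOINTS}")
    return [f"{name}/*"]


def download_checkpoint(repo_id: str, name: str, local_dir: Optional[str] = None) -> str:
    """Download one checkpoint and return the local snapshot path.

    `repo_id` is the Hugging Face id of this model repository (placeholder,
    not stated in this card). Requires `pip install huggingface_hub`.
    """
    # Imported lazily so the helper above works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(name),
        local_dir=local_dir,
    )


# Example call (needs network access and the real repository id):
# path = download_checkpoint("<org>/<repo>", "DIAL-3B-fewshot", local_dir="./checkpoints")
```

Restricting the download with `allow_patterns` avoids pulling both checkpoints when only one is needed. If the files turn out to be organized differently, adjust the patterns or drop the argument to download the whole repository.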