Mimic Intent, Not Just Trajectories
Paper: arXiv:2602.08602
MINT (Mimic Intent, Not just Trajectories) is a framework for end-to-end imitation learning in dexterous manipulation. Imitation learning has achieved success in robotic manipulation, but models that mimic raw trajectories often struggle with adaptation and transfer. MINT addresses this by explicitly disentangling behavior intent from execution details: it learns a hierarchical, multi-scale token representation of actions through frequency-space tokenization.
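To build intuition for what "multi-scale frequency-space" can mean, here is a minimal sketch that splits a toy 1-D action chunk into coarse, mid, and fine components using a discrete cosine transform. It is an illustration only and not MINT's tokenizer: the choice of DCT, the band edges, the toy trajectory, and the function names `dct_ii`, `idct_ii`, and `split_into_scales` are all assumptions made for this example, and no quantization or learned codebook is involved.

```python
import math


def dct_ii(signal):
    """Orthonormal DCT-II of a 1-D sequence (pure Python, illustrative only)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def idct_ii(coeffs):
    """Inverse of dct_ii (orthonormal DCT-III)."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        signal.append(total)
    return signal


def split_into_scales(signal, band_edges):
    """Split a signal into frequency bands whose time-domain parts sum to the signal.

    band_edges are hypothetical cut-offs (coefficient indices) chosen for this example.
    """
    coeffs = dct_ii(signal)
    bands = []
    start = 0
    for end in list(band_edges) + [len(coeffs)]:
        # Keep only the coefficients in [start, end) and transform back.
        masked = [c if start <= k < end else 0.0 for k, c in enumerate(coeffs)]
        bands.append(idct_ii(masked))
        start = end
    return bands


# Toy action chunk: a slow reach (ramp) plus small high-frequency jitter.
horizon = 16
trajectory = [
    t / (horizon - 1) + 0.05 * math.sin(2 * math.pi * 6 * t / horizon)
    for t in range(horizon)
]

coarse, mid, fine = split_into_scales(trajectory, band_edges=[2, 6])

# The transform is linear and orthonormal, so the bands add back up to the input.
reconstruction = [c + m + f for c, m, f in zip(coarse, mid, fine)]
max_error = max(abs(a - b) for a, b in zip(trajectory, reconstruction))
```

In this toy setting the coarse band carries the overall shape of the motion, loosely analogous to intent, while the higher bands carry fine detail, loosely analogous to execution. How MINT actually forms and quantizes its scales is described in the paper.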
This model is designed to be used with the LeRobot library.
To evaluate the policy in the LIBERO environment, use the following command. It requires the MINT-tokenizer-libero checkpoint, which is passed via --policy.vqvae_name_or_path:
lerobot-eval \
    --policy.path=huangrm/MINT-libero \
    --policy.vqvae_name_or_path=huangrm/MINT-tokenizer-libero \
    --env.type=libero \
    --env.task=libero_10,libero_object,libero_spatial,libero_goal \
    --eval.batch_size=1 \
    --eval.n_episodes=2 \
    --seed=42 \
    --policy.n_action_steps=4
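If you prefer to evaluate one suite at a time, for example to keep logs separate, the command above can be assembled per suite. The following is a minimal sketch that only reuses the flags shown above. The helper `build_eval_command` is hypothetical, and the sketch assumes that --env.task also accepts a single suite name. It prints the commands as a dry run instead of executing them.

```python
import shlex

# Suite names taken from the --env.task value in the command above.
SUITES = ["libero_10", "libero_object", "libero_spatial", "libero_goal"]


def build_eval_command(suite, n_episodes=2, seed=42):
    """Return the lerobot-eval argument list for one LIBERO suite (hypothetical helper)."""
    return [
        "lerobot-eval",
        "--policy.path=huangrm/MINT-libero",
        "--policy.vqvae_name_or_path=huangrm/MINT-tokenizer-libero",
        "--env.type=libero",
        f"--env.task={suite}",  # assumption: a single suite name is accepted here
        "--eval.batch_size=1",
        f"--eval.n_episodes={n_episodes}",
        f"--seed={seed}",
        "--policy.n_action_steps=4",
    ]


commands = [build_eval_command(suite) for suite in SUITES]

for cmd in commands:
    # Dry run: print the shell-quoted command.
    # To execute instead, use subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as an argument list avoids shell quoting issues if you later switch from printing to `subprocess.run`.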
@article{huang2026mimic,
  title={Mimic Intent, Not Just Trajectories},
  author={Huang, Renming and Zeng, Chendong and Tang, Wenjing and Cai, Jintian and Lu, Cewu and Cai, Panpan},
  journal={arXiv preprint arXiv:2602.08602},
  year={2026}
}