This model accompanies our paper, "Developing Vision-Language-Action Model from Egocentric Videos."


Citation

If you use this model, please cite:

@article{yoshida2025developing,
  title   = {Developing Vision-Language-Action Model from Egocentric Videos},
  author  = {Yoshida, Tomoya and Kurita, Shuhei and Nishimura, Taichi and Mori, Shinsuke},
  journal = {arXiv preprint arXiv:2509.21986},
  year    = {2025}
}

Model size: 4B params (Safetensors). Tensor types: F32, BF16.