Add model card and robotics pipeline tag (#1)
Commit: 3ee2284a831d703ce7060812813195606ea3f126
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

README.md
---
license: mit
pipeline_tag: robotics
---

# FrameSkip: Learning from Fewer but More Informative Frames in VLA Training

[**Paper**](https://huggingface.co/papers/2605.13757) | [**Code**](https://github.com/ZGC-EmbodyAI/FrameSkip) | [**Project Collection**](https://huggingface.co/collections/VLyb/frameskip)

**FrameSkip** is a training-time frame selection framework for Vision-Language-Action (VLA) models. Instead of treating every frame in a dense robot demonstration trajectory as equally useful supervision, FrameSkip scores trajectory frames with lightweight cues and trains primarily on fewer but more informative frames.

FrameSkip is designed as a data-layer intervention: it changes which frames are exposed during training while leaving the VLA architecture, action head, loss function, and inference procedure unchanged.

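
To make the data-layer idea concrete, here is a minimal sketch (hypothetical names, not the released implementation) of a dataset wrapper that exposes only high-scoring frames to an otherwise unchanged training loop:

```python
import numpy as np

def select_informative_frames(scores, keep_ratio=0.2, protected=None):
    """Keep the top `keep_ratio` fraction of frames by importance score,
    always retaining any `protected` indices (e.g. gripper transitions)."""
    scores = np.asarray(scores, dtype=float)
    n_keep = max(1, int(round(keep_ratio * len(scores))))
    keep = set(np.argsort(scores)[-n_keep:].tolist())
    if protected is not None:
        keep.update(protected)
    return sorted(keep)

class FrameSkipDataset:
    """Wraps one trajectory and yields only the selected (frame, action)
    pairs; the model, loss, and inference code are untouched."""
    def __init__(self, frames, actions, scores, keep_ratio=0.2, protected=None):
        idx = select_informative_frames(scores, keep_ratio, protected)
        self.samples = [(frames[i], actions[i]) for i in idx]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        return self.samples[i]
```

Because the selection happens entirely inside the dataset object, any dataloader-based VLA training stack can consume it without other changes.
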
## Highlights

- **Efficiency:** Achieves higher success rates than training on all frames, while using only a compressed view of trajectories (retaining ~20% of frames).
- **Architecture-agnostic:** Operates entirely in the dataloader, making it compatible with various VLA architectures.
- **Importance-guided:** Uses action variation, visual-action coherence, task-progress priors, and gripper-transition preservation to score frames.

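
As a rough sketch of two of these cues, action variation and gripper-transition preservation (the function names and the exact weighting below are illustrative assumptions, not the paper's formulas), per-frame scores could be computed as:

```python
import numpy as np

def action_variation_scores(actions):
    """Score each frame by how much the action changes at it: frames
    where the action is nearly constant carry less new supervision."""
    actions = np.asarray(actions, dtype=float)
    diffs = np.linalg.norm(np.diff(actions, axis=0), axis=-1)
    # Repeat the last difference so scores match the trajectory length.
    return np.concatenate([diffs, diffs[-1:]])

def gripper_transition_indices(gripper, threshold=0.5):
    """Indices where the scalar gripper command crosses open/close;
    FrameSkip preserves such frames regardless of other scores."""
    g = np.asarray(gripper, dtype=float) > threshold
    return np.flatnonzero(g[1:] != g[:-1]).tolist()
```

The transition indices would then be passed as the protected set so that grasp and release events are never dropped from the training view.
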
## Usage

FrameSkip is built on the [starVLA](https://github.com/starVLA/starVLA) training and evaluation stack. The released checkpoints follow the standard starVLA checkpoint format and can be loaded in the same way as starVLA VLA policies.

For simulation evaluation, follow the model loading and evaluation workflow of the **QwenGR00T** architecture in starVLA, replacing the checkpoint path with the downloaded FrameSkip checkpoint.

## Citation

If you find FrameSkip useful, please cite the following work:

```bibtex
@article{yu2025frameskip,
  title={FrameSkip: Learning from Fewer but More Informative Frames in VLA Training},
  author={Yu, Bin and Lian, Shijie and Lin, Xiaopeng and Shen, Zhaolong and Wei, Yuliang and Wu, Changti and Yuan, Hang and Liu, Haishan and Wang, Bailing and Huang, Cong and Chen, Kai},
  journal={arXiv preprint arXiv:2605.13757},
  year={2025}
}
```