# Pi0.5 SnapFlow Distilled Student (1-NFE)

First public reproduction of SnapFlow distillation (arXiv:2604.05656) for Vision-Language-Action models.

A 1-NFE (one network function evaluation, i.e. a single denoising step) student distilled from lerobot/pi05_libero_finetuned_v044 using SnapFlow self-distillation. It runs ~10× faster per inference than the teacher (which requires 10 denoising steps) while matching or beating the teacher's task success on LIBERO.
## Headline result
| Metric | Student (this model, 1-NFE) | Teacher (pi0.5, 10-NFE) |
|---|---|---|
| LIBERO 5-task @ N=30 | 29/30 = 96.7% | 28/30 = 93.3% |
| Inference steps per /act chunk | 1 | 10 |
| Training cost | ~$25 on Modal | n/a (pretrained) |
Net: +3.4 percentage points over the teacher with ~10× fewer denoising steps.
## How it was distilled
SnapFlow self-distillation:
- Student initialized from the teacher's weights
- The paper's Equation 11 consistency loss enforces self-consistency of the velocity field
- 10k training steps on a Modal A100-80GB, batch size 4, bf16 precision
- No reward signal; pure self-distillation from the teacher's flow
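To make the consistency idea concrete, here is a minimal sketch of a velocity-field self-consistency loss in the spirit of the setup above. The exact Equation 11 is defined in the paper (arXiv:2604.05656); the function names and the one-big-step-vs-two-half-steps target below are illustrative stand-ins, not the real objective (which in practice is implemented in PyTorch with a stop-gradient on the target, not NumPy).

```python
import numpy as np

def consistency_loss(velocity_field, x0, t, dt):
    """Penalize disagreement between one big Euler step and two chained half steps."""
    one_step = x0 + 2.0 * dt * velocity_field(x0, t)   # "student": a single large step
    mid = x0 + dt * velocity_field(x0, t)              # "target": two smaller steps
    two_step = mid + dt * velocity_field(mid, t + dt)
    return float(np.mean((one_step - two_step) ** 2))

# Toy linear field v(x, t) = -x, which drives samples toward zero.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))
loss = consistency_loss(lambda x, t: -x, x, t=0.0, dt=0.05)
```

Driving this loss to zero over all step sizes makes one student step agree with many small teacher steps, which is what licenses 1-NFE inference.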
Full reproduction via Reflex VLA:
```shell
pip install reflex-vla
reflex distill \
  --teacher lerobot/pi05_libero_finetuned_v044 \
  --steps 10000 \
  --batch 4
```
## Inference
This checkpoint is the LeRobot PyTorch policy. For deployment, use Reflex's decomposed pi0.5 runtime, which runs the VLM prefix once per action chunk and the action expert N times for the denoise loop. In an A100-80GB microbenchmark (2026-04-23), the VLM accounted for 89.8% of compute, giving a measured 9.79× theoretical speedup over the monolithic ONNX export.
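The decomposed-runtime idea can be sketched as follows: pay for the heavy VLM prefix once per chunk, then reuse its cached features for every denoising step of the lightweight expert. The function names (`vlm_prefix`, `action_expert`) are illustrative placeholders, not Reflex APIs.

```python
def act_chunk(obs, num_denoise_steps, vlm_prefix, action_expert, init_noise):
    prefix_cache = vlm_prefix(obs)           # expensive: runs once per chunk
    x = init_noise
    for step in range(num_denoise_steps):    # cheap: expert-only iterations
        x = action_expert(x, step, prefix_cache)
    return x

# Counting stubs to show the call pattern: 1 VLM call, N expert calls.
calls = {"vlm": 0, "expert": 0}

def vlm(obs):
    calls["vlm"] += 1
    return "cached-features"

def expert(x, step, cache):
    calls["expert"] += 1
    return x + 1

out = act_chunk(obs=None, num_denoise_steps=10, vlm_prefix=vlm,
                action_expert=expert, init_noise=0)
```

A monolithic export would instead re-run the VLM on every denoising step, which is where the speedup comes from.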
```shell
reflex export <local-checkpoint> --decomposed
reflex serve <export-dir>
```
Latency on A100-80GB (measured 2026-04-23):
- Full forward: 98ms / chunk
- Cache hit (expert only): 10ms / chunk
- Theoretical speedup: 9.79× (validated)
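The 9.79× figure is consistent with Amdahl-style arithmetic on the numbers above (the exact value depends on unrounded measurements): if the VLM is ~89.8% of per-step compute, removing it on cache hits caps the per-step speedup near 1/(1 − 0.898), which also matches the 98 ms full-forward vs 10 ms cache-hit latencies.

```python
vlm_share = 0.898                      # VLM fraction of per-step compute (microbench above)
theoretical = 1.0 / (1.0 - vlm_share)  # ~9.8×: ceiling from skipping the VLM on cache hits
measured = 98.0 / 10.0                 # full forward vs expert-only latency, also ~9.8×
```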
Fits on a Jetson Orin Nano 8 GB at FP16 (6.5 GB after FP16 conversion); the monolithic FP32 teacher is 12.99 GB and does not fit.
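The FP16 footprint follows directly from the FP32 size above: halving the bytes per parameter halves the weight storage, which is what brings the model under the Orin Nano's 8 GB. (This counts weights only; activation and runtime overhead are extra.)

```python
fp32_gb = 12.99            # monolithic FP32 weights (reported above)
params_b = fp32_gb / 4     # ~3.25 billion parameters at 4 bytes each
fp16_gb = params_b * 2     # ~6.5 GB at 2 bytes each
```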
## Files
- `model.safetensors` — student policy weights
- `config.json` — LeRobot policy config (PI05Policy)
- `distill_provenance.json` — training provenance (seeds, dataset, steps, base model)
## License
Apache 2.0. Inherits from lerobot/pi05_libero_finetuned_v044. The SnapFlow algorithm is the work of the paper authors (arXiv:2604.05656).
## Citation
```bibtex
@article{snapflow2026,
  title={SnapFlow: Self-Distillation for Flow-Matching Vision-Language-Action Models},
  journal={arXiv preprint arXiv:2604.05656},
  year={2026}
}
```
If you use this checkpoint or the Reflex deployment toolchain, please cite both the SnapFlow paper and link to https://github.com/rylinjames/reflex-vla.
## Reflex
Reflex VLA is the open-source deployment toolchain that produced this checkpoint and runs it ~9× faster on cheap edge hardware via decomposed VLM/expert ONNX export. Cross-family support: pi0, pi0.5, SmolVLA, GR00T, and OpenVLA in one binary.