Pi0.5 SnapFlow Distilled Student (1-NFE)

First public reproduction of SnapFlow distillation (arXiv:2604.05656) for Vision-Language-Action models.

A 1-NFE (single-denoising-step) student distilled from lerobot/pi05_libero_finetuned_v044 via SnapFlow self-distillation. It runs ~10× faster per inference than the teacher (which requires 10 denoising steps) while matching or beating the teacher's task success on LIBERO.

Headline result

Metric                            Student (this model, 1-NFE)   Teacher (pi0.5, 10-NFE)
LIBERO 5-task success @ N=30      29/30 (96.7%)                 28/30 (93.3%)
Inference steps per /act chunk    1                             10
Training cost                     ~$25 on Modal                 n/a (pretrained)

Net: +3.3 percentage points over the teacher (one extra success in 30) with one-tenth the denoising steps.
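For the exact numbers: the rounded table percentages subtract to 3.4 points, but the true gap of one success in 30 is 1/30 ≈ 3.3 points.

```python
# Success-rate arithmetic behind the headline result.
student = 29 / 30                      # 0.9667 -> reported as 96.7%
teacher = 28 / 30                      # 0.9333 -> reported as 93.3%
delta_pp = (student - teacher) * 100   # exactly 100/30 = 3.33... points

print(f"student {student:.1%}, teacher {teacher:.1%}, gap +{delta_pp:.1f} pp")
```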

How it was distilled

SnapFlow self-distillation:

  • Student initialized from teacher weights
  • The consistency loss (Eq. 11 of the paper) enforces self-consistency of the learned velocity field
  • 10k training steps on Modal A100-80GB, batch=4, bf16
  • No reward signal; pure self-distillation from the teacher's flow
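The recipe above can be sketched end-to-end on a toy 1-D flow: integrate the teacher's velocity field for 10 Euler steps (10-NFE), then fit a one-step student to land at the same endpoint. This is an illustrative stand-in, not the paper's actual Eq. 11 objective; the linear-interpolation teacher, the single-parameter student, and the closed-form fit are all hypothetical.

```python
import numpy as np

# Toy sketch of teacher-to-student flow distillation. The real loss
# is Eq. 11 of arXiv:2604.05656; everything here is a stand-in.
rng = np.random.default_rng(0)

def teacher_velocity(x, t):
    # Velocity of a linear flow toward a fixed target "action".
    target = 1.0
    return (target - x) / (1.0 - t + 1e-3)

def teacher_rollout(x0, n_steps=10):
    # 10-NFE Euler integration of the teacher's flow.
    x, dt = x0, 1.0 / n_steps
    for i in range(n_steps):
        x = x + dt * teacher_velocity(x, i * dt)
    return x

def student_onestep(x0, w):
    # 1-NFE student: a single scaled jump along the initial velocity.
    return x0 + w * teacher_velocity(x0, 0.0)

x0 = rng.standard_normal(256)              # sampled "noise" actions
target = teacher_rollout(x0)               # teacher's 10-step endpoint
v = teacher_velocity(x0, 0.0)
w = np.dot(v, target - x0) / np.dot(v, v)  # least-squares fit of w
loss = float(np.mean((student_onestep(x0, w) - target) ** 2))
print(f"fitted w={w:.3f}, one-step consistency loss={loss:.2e}")
```

With a one-parameter student and a linear teacher flow, the fit is essentially exact; the point is only to show the shape of the objective: one student step matched against a multi-step teacher rollout.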

Full reproduction via Reflex VLA:

pip install reflex-vla
reflex distill \
    --teacher lerobot/pi05_libero_finetuned_v044 \
    --steps 10000 \
    --batch 4

Inference

This checkpoint is the LeRobot PyTorch policy. For deployment, use Reflex's decomposed pi0.5 runtime, which runs the VLM prefix once per action chunk and the expert denoising pass N times. In a 2026-04-23 microbenchmark on an A100-80GB, the VLM accounted for 89.8% of compute, yielding a 9.79× theoretical speedup over the monolithic ONNX export.

reflex export <local-checkpoint> --decomposed
reflex serve <export-dir>

Latency on A100-80GB (measured 2026-04-23):

  • Full forward: 98ms / chunk
  • Cache hit (expert only): 10ms / chunk
  • Theoretical speedup: 9.79× (validated)
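These latency figures are mutually consistent under a simple cost model. The ~88 ms / ~10 ms split of the 98 ms full forward is an assumption inferred from the reported cache-hit latency; only the totals were measured.

```python
# Consistency check of the decomposed-runtime numbers above.
vlm_ms = 88.0        # VLM prefix, run once per chunk (assumed split)
expert_ms = 10.0     # action-expert denoise, run every chunk

full_forward_ms = vlm_ms + expert_ms      # cache miss: 98 ms
cache_hit_ms = expert_ms                  # cache hit: expert only
vlm_share = vlm_ms / full_forward_ms      # ~89.8% VLM compute share
speedup = full_forward_ms / cache_hit_ms  # ~9.8x vs reported 9.79x

print(f"full={full_forward_ms:.0f} ms, hit={cache_hit_ms:.0f} ms, "
      f"VLM share={vlm_share:.1%}, speedup={speedup:.2f}x")
```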

Fits on a Jetson Orin Nano 8 GB at FP16: 6.5 GB after conversion, versus 12.99 GB for the monolithic FP32 teacher, which does not fit.

Files

  • model.safetensors — student policy weights
  • config.json — LeRobot policy config (PI05Policy)
  • distill_provenance.json — training provenance (seeds, dataset, steps, base model)

License

Apache 2.0, inherited from lerobot/pi05_libero_finetuned_v044. The SnapFlow algorithm is the work of the paper authors (arXiv:2604.05656).

Citation

@article{snapflow2026,
  title={SnapFlow: Self-Distillation for Flow-Matching Vision-Language-Action Models},
  journal={arXiv preprint arXiv:2604.05656},
  year={2026}
}

If you use this checkpoint or the Reflex deployment toolchain, please cite both the SnapFlow paper and link to https://github.com/rylinjames/reflex-vla.

Reflex

Reflex VLA is the open-source deployment toolchain that produced this checkpoint; it runs the model ~9× faster on inexpensive edge hardware via decomposed VLM/expert ONNX export, with cross-family support for pi0, pi0.5, SmolVLA, GR00T, and OpenVLA in one binary.
