VLAarchTests Bench

This repository is a bundled benchmark workspace for future VLAarchTests model iterations, with an emphasis on the RLBench2 take tray out of oven task.

Included

VLAarchTests benchmark code and generated public benchmark manifest
Patched AnyBimanual RLBench runtime used to execute the public oven benchmark
Patched PointFlowMatch + diffusion_policy source snapshot used to execute take_shoes_out_of_box
Official katefgroup/3d_flowmatch_actor PerAct2 checkpoint and public test data
Official PointFlowMatch 1717447341-indigo-quokka checkpoint for take_shoes_out_of_box
Public AnyBimanual LF baseline weights and comparison logs
Verified benchmark reports:
- oven subset run: 9/10
- oven full official run: 95/100 = 0.95
- shoes GPU search: non-zero success verified before a later simulator crash
- hybrid public benchmark smoke outputs
DexGarmentLab benchmark-related validation scripts and validation logs
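
The success rates quoted in the verified reports are plain ratios of successful episodes to total episodes. A minimal sketch of that arithmetic, using only the counts stated above; the helper name `success_rate` is hypothetical and is not part of the benchmark code:

```python
def success_rate(successes: int, episodes: int) -> float:
    """Return the fraction of successful episodes (hypothetical helper)."""
    if episodes <= 0:
        raise ValueError("episodes must be positive")
    if not 0 <= successes <= episodes:
        raise ValueError("successes must be between 0 and episodes")
    return successes / episodes


# Counts copied from the verified oven reports listed above.
runs = {
    "oven subset run": (9, 10),
    "oven full official run": (95, 100),
}

for name, (ok, total) in runs.items():
    print(f"{name}: {ok}/{total} = {success_rate(ok, total):.2f}")
```

The shoes run is left out on purpose: the report only states that non-zero success was verified, so there is no count to plug in.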

The strongest public out-of-box checkpoint validated here is:

Official oven result artifacts:

reports/3dfa_peract2_take_tray_out_of_oven_subset10/eval_after_official_ttm.json
reports/3dfa_peract2_take_tray_out_of_oven_full100/eval.json
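
After cloning, it can be useful to confirm that the two oven report files are present and parse as JSON. The sketch below assumes it is run from the repository root and makes no assumption about the JSON schema; the function name `check_reports` is hypothetical:

```python
import json
from pathlib import Path

# Report paths exactly as listed above, relative to the repository root.
REPORTS = [
    "reports/3dfa_peract2_take_tray_out_of_oven_subset10/eval_after_official_ttm.json",
    "reports/3dfa_peract2_take_tray_out_of_oven_full100/eval.json",
]


def check_reports(root, rel_paths):
    """Map each relative path to 'ok', 'missing', or 'invalid json'."""
    status = {}
    for rel in rel_paths:
        path = Path(root) / rel
        if not path.is_file():
            status[rel] = "missing"
            continue
        try:
            # Only checks that the file decodes and parses; the schema is not inspected.
            json.loads(path.read_text(encoding="utf-8"))
            status[rel] = "ok"
        except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
            status[rel] = "invalid json"
    return status


if __name__ == "__main__":
    # Assumption: the current directory is the repository root.
    for rel, state in check_reports(".", REPORTS).items():
        print(f"{state:>12}  {rel}")
```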

Shoes result artifacts:

Key files:

code/VLAarchtests4/code/VLAarchtests2_code/VLAarchtests/code/reveal_vla_bimanual/eval/public_benchmark_package.py
code/VLAarchtests4/code/VLAarchtests2_code/VLAarchtests/code/reveal_vla_bimanual/eval/run_rlbench_hybrid_smoke.py
third_party/AnyBimanual/third_party/RLBench/rlbench/bimanual_tasks/bimanual_take_tray_out_of_oven.py
third_party/AnyBimanual/third_party/RLBench/rlbench/task_ttms/bimanual_take_tray_out_of_oven.ttm
third_party/PointFlowMatch/pfp/envs/rlbench_env.py
third_party/PointFlowMatch/pfp/policy/fm_policy.py
scripts/run_pointflowmatch_take_shoes_out_of_box.sh

Not included

CoppeliaSim v4.1.0 binary runtime
Local Python environments under /workspace/envs
Full IsaacSim installation
Full DexGarmentLab simulator assets beyond the benchmark-related scripts and logs

See docs/ENVIRONMENT_NOTES.md for the runtime notes used in this workspace.
