# VLAarchTests Bench

This repository is a bundled benchmark workspace for future VLAarchTests model iterations, with emphasis on RLBench2 `take tray out of oven`.

## Included

- VLAarchTests benchmark code and generated public benchmark manifest
- Patched AnyBimanual RLBench runtime used to execute the public oven benchmark
- Patched PointFlowMatch + `diffusion_policy` source snapshot used to execute `take shoes out of box`
- Official `katefgroup/3d_flowmatch_actor` PerAct2 checkpoint and public test data
- Official PointFlowMatch `1717447341-indigo-quokka` checkpoint for `take_shoes_out_of_box`
- Public AnyBimanual LF baseline weights and comparison logs
- Verified benchmark reports:
  - oven subset run: `9/10`
  - oven full official run: `95/100 = 0.95`
  - shoes GPU search: non-zero success verified before later simulator crash
  - hybrid public benchmark smoke outputs
- DexGarmentLab benchmark-related validation scripts and validation logs

## Key Result

The strongest public out-of-box checkpoint validated here is:

- `models/3dfa_peract2/3dfa_peract2.pth`

Official oven result artifacts:

- `reports/3dfa_peract2_take_tray_out_of_oven_subset10/eval_after_official_ttm.json`
- `reports/3dfa_peract2_take_tray_out_of_oven_full100/eval.json`

Shoes result artifacts:

- `reports/pointflowmatch_take_shoes_out_of_box_ep10_k50_gpu/summary.json`
- `reports/pointflowmatch_take_shoes_out_of_box_ep10_k50_gpu/run.log`

## Important Code Paths

- `code/VLAarchtests4/code/VLAarchtests2_code/VLAarchtests/code/reveal_vla_bimanual/eval/public_benchmark_package.py`
- `code/VLAarchtests4/code/VLAarchtests2_code/VLAarchtests/code/reveal_vla_bimanual/eval/run_rlbench_hybrid_smoke.py`
- `third_party/AnyBimanual/third_party/RLBench/rlbench/bimanual_tasks/bimanual_take_tray_out_of_oven.py`
- `third_party/AnyBimanual/third_party/RLBench/rlbench/task_ttms/bimanual_take_tray_out_of_oven.ttm`
- `third_party/PointFlowMatch/pfp/envs/rlbench_env.py`
- `third_party/PointFlowMatch/pfp/policy/fm_policy.py`
- `scripts/run_pointflowmatch_take_shoes_out_of_box.sh`

## External Dependencies Not Mirrored Here

- CoppeliaSim v4.1.0 binary runtime
- Local Python environments under `/workspace/envs`
- Full IsaacSim installation
- Full DexGarmentLab simulator assets beyond the benchmark-related scripts and logs

See `docs/ENVIRONMENT_NOTES.md` for the runtime notes used in this workspace.
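
Because several runtime pieces are not mirrored, a quick preflight check before running a benchmark can save a failed launch. The following is a minimal sketch, not part of this repository: the helper names `missing_paths` and `external_status` are hypothetical, the file list is copied from the paths named in this README, and the use of the `COPPELIASIM_ROOT` environment variable is an assumption based on the usual PyRep convention. `docs/ENVIRONMENT_NOTES.md` remains the authority on how this workspace is actually configured.

```python
import os
from pathlib import Path

# Paths copied from this README, relative to the repository root.
EXPECTED_FILES = [
    "models/3dfa_peract2/3dfa_peract2.pth",
    "reports/3dfa_peract2_take_tray_out_of_oven_full100/eval.json",
    "reports/pointflowmatch_take_shoes_out_of_box_ep10_k50_gpu/summary.json",
    "scripts/run_pointflowmatch_take_shoes_out_of_box.sh",
    "docs/ENVIRONMENT_NOTES.md",
]

# Location of the local Python environments named in this README.
LOCAL_ENVS_DIR = "/workspace/envs"


def missing_paths(repo_root, relative_paths=EXPECTED_FILES):
    """Return the entries of relative_paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in relative_paths if not (root / rel).exists()]


def external_status(environ=os.environ):
    """Report on external dependencies that are not mirrored in the repository.

    ASSUMPTION: CoppeliaSim is located through COPPELIASIM_ROOT, as PyRep
    setups commonly do; this workspace may locate it differently.
    """
    return {
        "COPPELIASIM_ROOT set": bool(environ.get("COPPELIASIM_ROOT")),
        f"{LOCAL_ENVS_DIR} present": Path(LOCAL_ENVS_DIR).is_dir(),
    }


if __name__ == "__main__":
    # Run from the repository root.
    for rel in missing_paths("."):
        print(f"missing: {rel}")
    for name, ok in external_status().items():
        print(f"{name}: {'ok' if ok else 'not found'}")
```

The check only confirms that files and directories are present; it does not validate the CoppeliaSim version or the contents of any report.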