wsagi committed on
Commit
7d4fa8f
·
verified ·
1 Parent(s): e7a804e

Add files using upload-large-folder tool

Files changed (4)
  1. README.md +57 -44
  2. config.json +2 -2
  3. model.safetensors +1 -1
  4. train_config.json +5 -5
README.md CHANGED
@@ -30,36 +30,42 @@ _An [X-VLA](https://arxiv.org/abs/2510.10274) (Florence2 + Soft-Prompted Transfo
30
  - [vitorcen/isaaclab-experience](https://github.com/vitorcen/isaaclab-experience): Isaac Lab + LeIsaac multi-policy comparison (parent project)
31
  - [vitorcen/LeIsaac-Training](https://github.com/vitorcen/LeIsaac-Training): LeIsaac fork (training scripts + design docs)
32
 
 
 
33
  ## TL;DR
34
 
35
  - **Task**: `Pick up the orange and put it in the plate`
36
  _Single-arm SO-101 picks 3 oranges sequentially and places each in a plate._
37
  - **Dataset**: [`LightwheelAI/leisaac-pick-orange`](https://huggingface.co/datasets/LightwheelAI/leisaac-pick-orange), 60 teleoperated demonstration episodes (50 train / 10 val split).
38
  - **Architecture**: X-VLA = Florence2 vision-language encoder + Soft-Prompted Transformer + Rectified-Flow action head (10 denoising steps); chunk_size=32, n_obs_steps=2.
39
- - **Training**: batch=8 / lr=1e-4 / **10k step** / **weak image-aug (brightness ±5% only)** / GRIPPER_SCALE=5 / ~18 min on RTX 4090.
40
- - **Eval** (benchmark-aligned: 3 rounds × 120s sim × 180s wall_cap, same conditions as the other leaderboard baselines): **4/9 oranges (44%)**, **ep2 = [T, T, T] 3/3** ⭐.


41
  - **⚠️ Critical inference setting**: `n_action_steps=32` (reuse the whole chunk, i.e. chunk_size).
42
  With the default `n_action_steps=8`, this ckpt scores **0/18** over 6 rounds, a catastrophic failure (replanning at every step makes the plans conflict). See [Inference caveat](#-推理关键配置--critical-inference-caveat) below.
43
 
44
  ## 模型亮点
45
  _Highlights_
46
 
47
- - **ep2 = 3/3, a perfect run under the benchmark setting (3 rounds × 120s sim × 180s wall_cap)**. No other baseline (ACT, DP, X-VLA-15k) had a single 3/3 episode under the same conditions.
48
- _Under standardized benchmark conditions (matching leaderboard protocol), ep2 placed all 3 oranges, a feat not achieved by ACT, DP, or X-VLA-15k under the same evaluation._
49
  - **Key role of `n_action_steps`**: changing the default 8 to 32 was the only reliable gain in the session (3.5× over baseline).
50
  _Exposes `n_action_steps` as the single most reliable improvement — switching from default 8 to chunk_size=32 (full chunk reuse) gave ~3.5× over baseline._
51
- - **Weak image-aug was the only aggregate-positive retrain**: lerobot's default ColorJitter + Sharpness + Affine over-regularizes the 50-demo dataset (13% per-ep); keeping only brightness ±5% (max_num_transforms=1) beat the baseline by +5.6%, and the 10k ckpt reaches 44% per-ep.
52
- _Out of 6 retrain experiments (velocity-reweight, L1 loss, default image-aug, weak image-aug, body-desc, L1+aug compound), **only weak image-aug was net positive**. Default aug strength was harmful (-11.1% vs baseline); minimal brightness-only aug at 10k step gave 44% per-ep on benchmark._
53
 
54
  ## 训练配方
55
  _Training recipe_
56
 
57
  ```bash
58
- # Stage 1: 10k steps from lerobot/xvla-base
59
- WEAK_IMAGE_AUG=1 \
60
- BATCH_SIZE=8 \
61
- MAX_STEPS=10000 \
62
- SAVE_FREQ=500 \
 
 
63
  OUTPUT_DIR=$LEISAAC/outputs/xvla-leisaac-pick-orange.weakaug \
64
  bash LeIsaac/scripts/finetune/xvla/train.sh
65
  ```
@@ -72,10 +78,6 @@ bash LeIsaac/scripts/finetune/xvla/train.sh
72
  --dataset.image_transforms.tfs={"brightness":{"weight":1.0,"type":"ColorJitter","kwargs":{"brightness":[0.95,1.05]}}}
73
  ```
74
 
75
- That is, at most 1 transform is sampled per batch, and only brightness ±5% is allowed (contrast / saturation / hue / SharpnessJitter / RandomAffine are disabled).
76
-
77
- See the [full retrain aggregate table](#完整-retrain-实验聚合表) for a detailed comparison.
78
-
79
  ## 推理 / Inference
80
 
81
  ### End-to-end server (compatible with the Isaac Sim ZMQ client)
@@ -92,10 +94,10 @@ bash server/serve_xvla.sh --detach
92
  POLICY_PORT=5558 \
93
  POLICY_TIMEOUT_MS=3000 \
94
  ACTION_HORIZON=1 \
95
- EVAL_ROUNDS=3 \
96
- EPISODE_LENGTH=120 \
97
  PROMPT="Pick up the orange and put it in the plate" \
98
- MAX_ROUND_WALL_S=180 \
99
  bash server/eval_pi05.sh
100
  ```
101
 
@@ -107,37 +109,48 @@ bash server/eval_pi05.sh
107
  |---|---|---|---|
108
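  | 8 (lerobot default) | **0/18** ❌ | 0% | Replans every step; chunk[0]→chunk[0]→... fight each other |
109
  | 16 | 4/18 | 22% | Partial chunk reuse |
110
- | **32 (= chunk_size)** | **6/18 + 3/3 perfect** ⭐ | **33%** | Full chunk reuse; each chunk is self-consistent |
111

112
  **X-VLA's RF action head generates a whole 32-step chunk in one shot, so the chunk must be fully executed in the env** for its planning to pay off. Replanning at every step instead misaligns the chunk sequence.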
113
 
114
  ## 评测结果
115
  _Evaluation_
116
 
117
- ### Benchmark-aligned (3 rounds × 120s sim × 180s wall_cap), same conditions as the leaderboard
118
-
119
- | Episode | oranges placed | wall time | Notes |
120
- |---|---|---|---|
121
- | 1 | 1/3 | 180.1s | wall_cap |
122
- | 2 | **3/3** ✅ | **180.0s** | **3/3 perfect** ⭐ |
123
- | 3 | 0/3 | 180.1s | wall_cap |
124
- | **Total** | **4/9 (44%)** | — | 0/3 strict (env never reported done; ep2 only placed all 3 oranges) |
125
-
126
- ### 6-round extended eval (60s sim × 90s wall_cap)
127
 
128
  | Episode | oranges placed | wall time |
129
  |---|---|---|
130
- | 1 | 1/3 | 90.0s |
131
- | 2 | **3/3** | 90.0s |
132
- | 3 | 0/3 | 90.0s |
133
- | 4 | 1/3 | 90.1s |
134
- | 5 | 0/3 | 90.0s |
135
- | 6 | 1/3 | 90.1s |
136
- | **Total** | **6/18 (33%)** | — |
137
 
138
  ### 完整 retrain 实验聚合表
139
 
140
- | Retrain config (5 ckpts × 6-round = 90 ep) | per-ep aggregate | vs baseline |
141
  |---|---|---|
142
  | 🥇 **Weak image-aug (brightness ±5%)** | **30.0%** | **+5.6** ⭐ |
143
  | L1 loss (OFT-lite, [Fine-Tuning VLA 2502.19645](https://arxiv.org/abs/2502.19645)) | 27.8% | +3.4 |
@@ -146,14 +159,14 @@ _Evaluation_
146
  | Default image-aug (lerobot default strength) | 13.3% | -11.1 |
147
  | Velocity-reweight β=2.0 ([AttenA+ 2605.13548](https://arxiv.org/abs/2605.13548)) | ~11% | -13 |
148
 
149
- See the parent project's HTML design doc [`vla_improvement_methods_checklist.html`](https://github.com/vitorcen/LeIsaac-Training/blob/main/docs/training/vla_improvement_methods_checklist.html) (includes CSVs for 90+ hyperparameter sweeps).
150
 
151
  ## 已证伪 / 不要再试的方法
152
  _Negative findings — DO NOT repeat_
153
 
154
- Strictly falsified across 90+ experiments (≥36 ep cumulative):
155

156
- - ❌ **TAE (Temporal Action Ensembling, [ALOHA 2304.13705](https://arxiv.org/abs/2304.13705))**: K∈{2,4,8} × m∈{0.1,0.3} all scored ≤1/9. X-VLA's RF + 10-step denoising is already smooth on its own.
157
  - ❌ **EMA action smoothing α∈[0.2, 0.7]**: the 5/9 at α=0.3 in the 3-round eval was a single-episode outlier; the 12-round retest gave 2/18, so it is actually harmful.
158
  - ❌ **"Grasp" verb in prompt**: 0/18, completely dead. Possibly "grasp" in the OXE datasets is associated with hand pose rather than a robot reach trajectory.
159
  - ❌ **"all <plural>" prompts**: 3/18; triggers multi-target ambiguity.
@@ -165,10 +178,10 @@ _Negative findings — DO NOT repeat_
165
 
166
  ## 限制 / Limitations
167
 
168
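- - **Sample size**: 44% per-ep is the benchmark 3-round estimate (9 ep), with a wide confidence interval of ±20%. The 6-round extended eval gives 33% (18 ep, CI ±15%).
169
- - **Only 50 demos in the dataset**: retrains that change the loss / aug are generally too aggressive; expanding to 80-100 demos should break the current ~44% per-ep ceiling.
170
- - **Multimodal place sub-task**: the model occasionally hovers and jitters after grasping. Fixing this covariate shift may need DAgger or synthetic relabeling.
171
- - **chunk_size=32 vs. wall clock**: 1 chunk = 32 steps × 33ms ≈ 1s planning period, which is more flexible than ACT (chunk=100, 3.3s period) but slower than DP DDIM-32 (200ms period).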
172
 
173
  ## 引用 / Citations
174
 
 
30
  - [vitorcen/isaaclab-experience](https://github.com/vitorcen/isaaclab-experience): Isaac Lab + LeIsaac multi-policy comparison (parent project)
31
  - [vitorcen/LeIsaac-Training](https://github.com/vitorcen/LeIsaac-Training): LeIsaac fork (training scripts + design docs)
32
 
33
+ **📌 Branches**: `main` = 17k (current best, 50% 6-round per-ep) · `ckpt-10k` (4/9 bench, 33% 6-round) · `ckpt-15k` (historical, 22% bench)
34
+
35
  ## TL;DR
36
 
37
  - **Task**: `Pick up the orange and put it in the plate`
38
  _Single-arm SO-101 picks 3 oranges sequentially and places each in a plate._
39
  - **Dataset**: [`LightwheelAI/leisaac-pick-orange`](https://huggingface.co/datasets/LightwheelAI/leisaac-pick-orange), 60 teleoperated demonstration episodes (50 train / 10 val split).
40
  - **Architecture**: X-VLA = Florence2 vision-language encoder + Soft-Prompted Transformer + Rectified-Flow action head (10 denoising steps); chunk_size=32, n_obs_steps=2.
41
+ - **Training**: batch=8 / lr=1e-4 / **17k step** (10k from base + 5k resume + 2k resume) / **weak image-aug (brightness ±5% only)** / GRIPPER_SCALE=5 / ~30 min on RTX 4090.
42
+ - **Eval**:
43
+ - **6-round (6 episodes × 3 oranges, 60s each)**: **9/18 oranges (50%)**, **all 6 episodes placed at least one orange (2,1,2,1,2,1)**; the most consistent ckpt of the session.
44
+ - **Benchmark-aligned 3-round (× 120s × 180s wall)**: 4/9 (44%), on par with 10k/15k (3-round variance is too high to separate them).
45
  - **⚠️ Critical inference setting**: `n_action_steps=32` (reuse the whole chunk, i.e. chunk_size).
46
  With the default `n_action_steps=8`, this ckpt scores **0/18** over 6 rounds, a catastrophic failure (replanning at every step makes the plans conflict). See [Inference caveat](#-推理关键配置--critical-inference-caveat) below.
47
 
48
  ## 模型亮点
49
  _Highlights_
50
 
51
+ - **Perfect 6-round consistency**: 6/6 episodes placed at least one orange (2,1,2,1,2,1 per ep, 9/18 oranges in total). Other baselines / earlier ckpts typically have 2-3 episodes at 0/3.
52
+ _Perfect 6-round consistency: every one of the 6 episodes placed at least 1 orange. Other baselines (10k, ACT, DP, X-VLA-15k) had 2-3 zero-orange episodes._
53
  - **Key role of `n_action_steps`**: changing the default 8 to 32 was the only reliable gain in the session (3.5× over baseline).
54
  _Exposes `n_action_steps` as the single most reliable improvement — switching from default 8 to chunk_size=32 (full chunk reuse) gave ~3.5× over baseline._
55
+ - **Weak image-aug + extended training**: across 90+ experiments, only the weak image-aug (brightness ±5%) retrain was aggregate-positive (+5.6% vs baseline); extending training from 7k to 17k steps kept raising the peak.
56
+ _Weak image-aug (brightness ±5% only, max_num_transforms=1) was the only aggregate-positive retrain in 90+ experiments. Extending training from 7k to 17k progressively raised the 6-round peak (33% → 50%)._
57
 
58
  ## 训练配方
59
  _Training recipe_
60
 
61
  ```bash
62
+ # Stage 1: 10k steps from lerobot/xvla-base
63
+ WEAK_IMAGE_AUG=1 BATCH_SIZE=8 MAX_STEPS=10000 SAVE_FREQ=500 \
64
+ OUTPUT_DIR=$LEISAAC/outputs/xvla-leisaac-pick-orange.weakaug \
65
+ bash LeIsaac/scripts/finetune/xvla/train.sh
66
+
67
+ # Resume training → 17k (a 15k checkpoint was also saved, but 17k is the best peak)
68
+ WEAK_IMAGE_AUG=1 BATCH_SIZE=8 MAX_STEPS=17000 SAVE_FREQ=500 RESUME=true \
69
  OUTPUT_DIR=$LEISAAC/outputs/xvla-leisaac-pick-orange.weakaug \
70
  bash LeIsaac/scripts/finetune/xvla/train.sh
71
  ```
 
78
  --dataset.image_transforms.tfs={"brightness":{"weight":1.0,"type":"ColorJitter","kwargs":{"brightness":[0.95,1.05]}}}
79
  ```
80
 
 
 
 
 
81
  ## 推理 / Inference
82
 
83
  ### End-to-end server (compatible with the Isaac Sim ZMQ client)
 
94
  POLICY_PORT=5558 \
95
  POLICY_TIMEOUT_MS=3000 \
96
  ACTION_HORIZON=1 \
97
+ EVAL_ROUNDS=6 \
98
+ EPISODE_LENGTH=60 \
99
  PROMPT="Pick up the orange and put it in the plate" \
100
+ MAX_ROUND_WALL_S=90 \
101
  bash server/eval_pi05.sh
102
  ```
103
 
 
109
  |---|---|---|---|
110
  | 8 (lerobot default) | **0/18** ❌ | 0% | Replans every step; chunk[0]→chunk[0]→... fight each other |
111
  | 16 | 4/18 | 22% | Partial chunk reuse |
112
+ | **32 (= chunk_size)** | **9/18 + 6/6 consistency** ⭐ | **50%** | Full chunk reuse; each chunk is self-consistent |
113

114
  **X-VLA's RF action head generates a whole 32-step chunk in one shot, so the chunk must be fully executed in the env** for its planning to pay off. Replanning at every step instead misaligns the chunk sequence.
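
The chunk-reuse behaviour described above can be sketched in a few lines. This is a minimal illustration, assuming the policy's actions are consumed from a queue that is refilled only when empty; `run_with_chunking`, `count_replans`, and the fake policy are hypothetical names, not part of lerobot or X-VLA, and integers stand in for action vectors.

```python
from collections import deque
from typing import Callable, Deque, List

CHUNK_SIZE = 32  # matches chunk_size in config.json


def run_with_chunking(
    plan_chunk: Callable[[int], List[int]],
    n_action_steps: int,
    total_steps: int,
) -> List[int]:
    """Execute n_action_steps actions from each predicted chunk, then replan."""
    queue: Deque[int] = deque()
    executed: List[int] = []
    for t in range(total_steps):
        if not queue:
            # Replan only when the queue is empty and keep the first
            # n_action_steps actions of the new chunk.
            queue.extend(plan_chunk(t)[:n_action_steps])
        executed.append(queue.popleft())
    return executed


def count_replans(n_action_steps: int, total_steps: int) -> int:
    """Count how often a (fake) policy is queried for a new chunk."""
    calls: List[int] = []

    def fake_policy(t: int) -> List[int]:
        # Hypothetical stand-in for the model: returns CHUNK_SIZE placeholder actions.
        calls.append(t)
        return list(range(t, t + CHUNK_SIZE))

    run_with_chunking(fake_policy, n_action_steps, total_steps)
    return len(calls)


if __name__ == "__main__":
    print(count_replans(32, 64))  # full chunk reuse
    print(count_replans(8, 64))   # only the first 8 actions of each chunk are used
```

With `n_action_steps=32` every predicted action is executed; with smaller values the tail of each chunk is discarded and the policy is queried more often. The sketch does not model why the shorter setting fails on this checkpoint.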
115
 
116
  ## 评测结果
117
  _Evaluation_
118
 
119
+ ### 6-round eval (18 ep × 60s × 90s wall_cap)
120
 
121
  | Episode | oranges placed | wall time |
122
  |---|---|---|
123
+ | 1 | 2/3 | 90.0s |
124
+ | 2 | 1/3 | 90.0s |
125
+ | 3 | 2/3 | 90.0s |
126
+ | 4 | 1/3 | 90.0s |
127
+ | 5 | 2/3 | 90.1s |
128
+ | 6 | 1/3 | 90.0s |
129
+ | **Total** | **9/18 (50%)**, **6/6 ep ≥1 orange ⭐** | — |
130
+
131
+ ### Benchmark-aligned 3-round (120s × 180s wall, same conditions as the leaderboard)
132
+
133
+ | Episode | oranges placed |
134
+ |---|---|
135
+ | 1 | 2/3 |
136
+ | 2 | 1/3 |
137
+ | 3 | 1/3 |
138
+ | **Total** | **4/9 (44%)** |
139
+
140
+ Note: 3-round variance is high. 10k/15k/17k all score ≈ 4/9 on the benchmark, but the 6-round (18 ep) view separates them clearly (10k 33%, 15k 22%, 17k 50%).
141
+
142
+ ### Weak aug full ckpt curve (6-round @ h=32)
143
+
144
+ | step | 6k | 7k | 8k | 9k | 10k | 11k | 12k | 13k | 14k | 15k | 16k | **17k** | 18k | 19k | 20k |
145
+ |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
146
+ | oranges | 6 | 5 | 6 | 4 | 6 | 4 | 5 | 4 | 4 | 7 | 5 | **9** | 5 | 7 | 5 |
147
+ | per-ep% | 33 | 28 | 33 | 22 | 33 | 22 | 28 | 22 | 22 | 39 | 28 | **50** | 28 | 39 | 28 |
148
+
149
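+ **Pattern**: a peak appears roughly every 5-7k steps (10k, 15k, 17k), and 17k is the current best. 19k = 39% is also high but below 17k. The trend beyond 20k is unclear; further training is needed to locate the overfit boundary.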
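
The per-ep% row follows from the oranges row by dividing by 18 (6 rounds × 3 oranges). A short sketch that reproduces the row and picks the best checkpoint; the counts are copied from the table above, and the variable names are illustrative only.

```python
ORANGES_PER_EVAL = 18  # 6 rounds x 3 oranges per round

# Oranges placed per checkpoint, copied from the curve table.
oranges_by_step = {
    "6k": 6, "7k": 5, "8k": 6, "9k": 4, "10k": 6,
    "11k": 4, "12k": 5, "13k": 4, "14k": 4, "15k": 7,
    "16k": 5, "17k": 9, "18k": 5, "19k": 7, "20k": 5,
}


def per_ep_percent(placed: int, total: int = ORANGES_PER_EVAL) -> int:
    """Success rate as a whole-number percentage."""
    return round(100 * placed / total)


percent_by_step = {step: per_ep_percent(n) for step, n in oranges_by_step.items()}
best_step = max(oranges_by_step, key=oranges_by_step.get)

if __name__ == "__main__":
    print(percent_by_step)
    print("best checkpoint:", best_step)
```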
150
 
151
  ### 完整 retrain 实验聚合表
152
 
153
+ | Retrain config (5 ckpts × 6-round = 90 ep, early 6-10k range) | per-ep aggregate | vs baseline |
154
  |---|---|---|
155
  | 🥇 **Weak image-aug (brightness ±5%)** | **30.0%** | **+5.6** ⭐ |
156
  | L1 loss (OFT-lite, [Fine-Tuning VLA 2502.19645](https://arxiv.org/abs/2502.19645)) | 27.8% | +3.4 |
 
159
  | Default image-aug (lerobot default strength) | 13.3% | -11.1 |
160
  | Velocity-reweight β=2.0 ([AttenA+ 2605.13548](https://arxiv.org/abs/2605.13548)) | ~11% | -13 |
161
 
162
+ See the parent project's HTML design doc [`vla_improvement_methods_checklist.html`](https://github.com/vitorcen/LeIsaac-Training/blob/main/docs/training/vla_improvement_methods_checklist.html).
163
 
164
  ## 已证伪 / 不要再试的方法
165
  _Negative findings — DO NOT repeat_
166
 
167
+ Strictly falsified across 90+ experiments:
168

169
+ - ❌ **TAE (Temporal Action Ensembling, [ALOHA 2304.13705](https://arxiv.org/abs/2304.13705))**: K∈{2,4,8} × m∈{0.1,0.3} all scored ≤1/9. RF + 10-step denoising is already smooth on its own.
170
  - ❌ **EMA action smoothing α∈[0.2, 0.7]**: the 5/9 at α=0.3 in the 3-round eval was a single-episode outlier; the 12-round retest gave 2/18, so it is actually harmful.
171
  - ❌ **"Grasp" verb in prompt**: 0/18, completely dead. Possibly "grasp" in the OXE datasets is associated with hand pose rather than a robot reach trajectory.
172
  - ❌ **"all <plural>" prompts**: 3/18; triggers multi-target ambiguity.
 
178
 
179
  ## 限制 / Limitations
180
 
181
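+ - **Sample size**: 50% per-ep is the 6-round estimate (18 ep), CI ±15%. The benchmark 3-round estimate (9 ep) has a wider CI of ±20%.
182
+ - **Only 50 demos in the dataset**: retrains that change the loss / aug are generally too aggressive; expanding to 80-100 demos should break the current ~50% per-ep ceiling.
183
+ - **Multimodal place sub-task**: the model occasionally places off-target after grasping (episodes still reach 1-2/3 placed, but never a perfect 3/3). May need DAgger or synthetic relabeling.
184
+ - **Overfit boundary**: the current ckpt is at 17k steps (training on to 25k is needed to verify); historically, 40k was deeply overfit ("cannot even touch it accurately").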
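
The README does not say how the ±15% / ±20% intervals were obtained. As a rough check, the sketch below computes the normal-approximation (Wald) half-width for a binomial proportion; the choice of this formula and of the z values is an assumption, not the author's stated method. Under that assumption the quoted figures are close to z ≈ 1.28 (about 80% two-sided coverage), while a 95% interval (z ≈ 1.96) would be noticeably wider.

```python
import math


def wald_half_width(successes: int, trials: int, z: float) -> float:
    """Half-width of the normal-approximation interval for a binomial proportion."""
    p = successes / trials
    return z * math.sqrt(p * (1 - p) / trials)


if __name__ == "__main__":
    for label, k, n in [("6-round", 9, 18), ("3-round benchmark", 4, 9)]:
        for z in (1.28, 1.96):
            hw = wald_half_width(k, n, z)
            print(f"{label}: {k}/{n}, z={z}: +/-{100 * hw:.1f} points")
```

With samples this small the Wald interval is only a coarse guide; a Wilson or exact interval would be a more careful choice.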
185
 
186
  ## 引用 / Citations
187
 
config.json CHANGED
@@ -57,7 +57,7 @@
57
  "private": null,
58
  "tags": null,
59
  "license": null,
60
- "pretrained_path": "lerobot/xvla-base",
61
  "chunk_size": 32,
62
  "n_action_steps": 8,
63
  "dtype": "bfloat16",
@@ -208,7 +208,7 @@
208
  224,
209
  224
210
  ],
211
- "num_image_views": 4,
212
  "empty_cameras": 1,
213
  "freeze_vision_encoder": true,
214
  "freeze_language_encoder": true,
 
57
  "private": null,
58
  "tags": null,
59
  "license": null,
60
+ "pretrained_path": "/home/david/work/isaaclab-experience/LeIsaac/outputs/xvla-leisaac-pick-orange.weakaug/checkpoints/last/pretrained_model",
61
  "chunk_size": 32,
62
  "n_action_steps": 8,
63
  "dtype": "bfloat16",
 
208
  224,
209
  224
210
  ],
211
+ "num_image_views": 5,
212
  "empty_cameras": 1,
213
  "freeze_vision_encoder": true,
214
  "freeze_language_encoder": true,
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a2142145a9c4fe618dd35ceaadba6b7e6be3391f2b09379d2d553efe951d718d
3
  size 1759596986
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:96f9993b6603cda5c13b4f58727e6ee59cb4b1bd33c49b257b7e95930bbd814a
3
  size 1759596986
train_config.json CHANGED
@@ -137,7 +137,7 @@
137
  "private": null,
138
  "tags": null,
139
  "license": null,
140
- "pretrained_path": "lerobot/xvla-base",
141
  "chunk_size": 32,
142
  "n_action_steps": 8,
143
  "dtype": "bfloat16",
@@ -288,7 +288,7 @@
288
  224,
289
  224
290
  ],
291
- "num_image_views": 4,
292
  "empty_cameras": 1,
293
  "freeze_vision_encoder": true,
294
  "freeze_language_encoder": true,
@@ -311,14 +311,14 @@
311
  "reward_model": null,
312
  "output_dir": "/home/david/work/isaaclab-experience/LeIsaac/outputs/xvla-leisaac-pick-orange.weakaug",
313
  "job_name": "xvla",
314
- "resume": false,
315
  "seed": 1000,
316
  "cudnn_deterministic": false,
317
  "num_workers": 4,
318
  "batch_size": 8,
319
  "prefetch_factor": 4,
320
  "persistent_workers": true,
321
- "steps": 10000,
322
  "eval_freq": 20000,
323
  "log_freq": 200,
324
  "tolerance_s": 0.0001,
@@ -366,5 +366,5 @@
366
  "observation.images.front": "observation.images.image",
367
  "observation.images.wrist": "observation.images.image2"
368
  },
369
- "checkpoint_path": null
370
  }
 
137
  "private": null,
138
  "tags": null,
139
  "license": null,
140
+ "pretrained_path": "/home/david/work/isaaclab-experience/LeIsaac/outputs/xvla-leisaac-pick-orange.weakaug/checkpoints/last/pretrained_model",
141
  "chunk_size": 32,
142
  "n_action_steps": 8,
143
  "dtype": "bfloat16",
 
288
  224,
289
  224
290
  ],
291
+ "num_image_views": 5,
292
  "empty_cameras": 1,
293
  "freeze_vision_encoder": true,
294
  "freeze_language_encoder": true,
 
311
  "reward_model": null,
312
  "output_dir": "/home/david/work/isaaclab-experience/LeIsaac/outputs/xvla-leisaac-pick-orange.weakaug",
313
  "job_name": "xvla",
314
+ "resume": true,
315
  "seed": 1000,
316
  "cudnn_deterministic": false,
317
  "num_workers": 4,
318
  "batch_size": 8,
319
  "prefetch_factor": 4,
320
  "persistent_workers": true,
321
+ "steps": 20000,
322
  "eval_freq": 20000,
323
  "log_freq": 200,
324
  "tolerance_s": 0.0001,
 
366
  "observation.images.front": "observation.images.image",
367
  "observation.images.wrist": "observation.images.image2"
368
  },
369
+ "checkpoint_path": "/home/david/work/isaaclab-experience/LeIsaac/outputs/xvla-leisaac-pick-orange.weakaug/checkpoints/last"
370
  }