Initial release: ACT 10k step + horizon=32 1/1 success on LeIsaac SO-101 PickOrange
- .gitattributes +1 -0
- README.md +180 -0
- act-pick-orange.png +3 -0
- config.json +71 -0
- model.safetensors +3 -0
- policy_postprocessor.json +32 -0
- policy_postprocessor_step_0_unnormalizer_processor.safetensors +3 -0
- policy_preprocessor.json +64 -0
- policy_preprocessor_step_3_normalizer_processor.safetensors +3 -0
- train_config.json +206 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+act-pick-orange.png filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,180 @@
---
license: apache-2.0
library_name: lerobot
pipeline_tag: robotics
tags:
- act
- lerobot
- so101
- leisaac
- pick-orange
- isaac-sim
datasets:
- LightwheelAI/leisaac-pick-orange
language:
- en
base_model: lerobot/act
---

# ACT-PickOrange

An [ACT (Action Chunking Transformer)](https://tonyzhaozh.github.io/aloha/) policy trained from scratch on the [LeIsaac SO-101 PickOrange](https://github.com/LightwheelAI/leisaac) task.

## TL;DR

- **Task**: `Pick up the orange and place it on the plate`. A single-arm SO-101 picks up 3 oranges one after another and places each on a plate.
- **Dataset**: [`LightwheelAI/leisaac-pick-orange`](https://huggingface.co/datasets/LightwheelAI/leisaac-pick-orange), 60 teleoperated demonstration episodes.
- **Architecture**: ACT with chunk_size=100, ~52M parameters, pure vision + joint state → action-chunk regression (no LLM, no diffusion).
- **Training**: batch=8 / lr=1e-5 / 10k steps / **image augmentation disabled**, ~5 h on an RTX 4090.
- **Eval**: Isaac Sim 5.1 + LeIsaac, **1/1 success @ 120 s sim time** (all 3 oranges placed on the plate).
- **⚠️ Critical inference setting**: `policy_action_horizon=32`.
  With the default value of 16 the policy gets stuck on the second orange (gripper jitter); with 8 it gets stuck on the first. See the "Critical inference caveat" section below.

## Highlights

- **Reproduces and validates the [shadowHokage/act_policy](https://huggingface.co/shadowHokage/act_policy) recipe**, with a comparable or better success rate in our single-episode eval.
- **Exposes a hidden trap in LeIsaac's default `policy_action_horizon=16`**: an ACT model with chunk_size=100 needs horizon ≥ 32 so that the macro-motion segment of each chunk can execute. See the root-cause diagnosis in this README.
- No image augmentation, no weight-decay tuning, no special tricks: a clean ACT baseline.

## Training recipe

| Item | Value |
|---|---|
| Dataset | `LightwheelAI/leisaac-pick-orange` (60 episodes, dual-cam 480×640 RGB + 6-DOF state, 30 Hz) |
| Policy | `act` (LeRobot implementation) |
| Backbone | ResNet18 vision encoder + Transformer encoder/decoder |
| `chunk_size` | 100 |
| `n_action_steps` | 100 |
| Batch size | 8 |
| Optimizer | AdamW |
| Learning rate | 1e-5 (constant) |
| Steps | 10,000 |
| Image augmentation | **disabled** |
| Hardware | RTX 4090 (24 GB) |
| Wall-clock | ~5 hours |
| Recipe credit | [shadowHokage/act_policy](https://huggingface.co/shadowHokage/act_policy) |

The training entrypoint script lives in our LeIsaac fork: [`scripts/training/act/train.sh`](https://github.com/vitorcen/LeIsaac/blob/main/scripts/training/act/train.sh).

## Eval results

| Config | Orange 1 | Orange 2 | Orange 3 | Episode success |
|---|---|---|---|---|
| horizon=8 | 🔴 Stuck (gripper closed, no motion) | not reached | not reached | 0/1 |
| horizon=16 | ✅ Success | 🟡 Gripper jitter | not reached | 0/1 |
| **horizon=32** | ✅ Success | ✅ Success after some struggling | ✅ Success after some struggling | **1/1** ✅ |

Test setup: Isaac Sim 5.1, task `LeIsaac-SO101-PickOrange-v0`, `episode_length_s=120`, `step_hz=30`, dual-cam observations.

**Single-sample caveat**: the 1/1 success rate comes from a single episode, not from an average over multiple runs. Even so, the monotonic failure-mode trend across horizon=8/16/32 (stuck → jitter → success) is evidence that this is a configuration issue, not a model-capability issue.

## ⚠️ Critical inference caveat

**ACT chunk_size=100 + the default horizon=16 never gets past the 2nd orange.** This is not an ACT weakness; it is a hidden trap in LeIsaac's default config.

### Root cause

Each ACT chunk outputs 100 action steps that together form one **complete plan**: the first ~10 steps are "startup / acceleration", and only the middle segment (steps 20-80) is the actual **macro-motion** (approach → grasp → lift → transport → release). The LeRobot async client uses a receding-horizon window and re-queries the policy every `policy_action_horizon` steps.

- horizon=8 → only the first 8 steps of each chunk run before the chunk is discarded and the policy is re-queried → execution never leaves the "startup" segment and **never reaches the macro-motion** → deadlock.
- horizon=16 → enough for the simple "approach → grasp" of orange #1, but the more complex "place → retreat → approach orange #2" transition needs a longer execution window → OOD state and short horizon compound → jitter.
- horizon=32 → gives the macro-motion a chance to execute in full; passes 1/1.

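To make the re-query arithmetic concrete, here is a minimal Python sketch of the receding-horizon execution described above. It is not LeIsaac or LeRobot code: `executed_chunk_indices` and `queries_per_episode` are hypothetical helper names, and the only inputs are values stated in this README (chunk_size=100, 30 Hz, 120 s episodes).

```python
import math

CHUNK_SIZE = 100        # actions per ACT chunk (chunk_size in config.json)
STEP_HZ = 30            # control rate stated in the README
EPISODE_LENGTH_S = 120  # episode length stated in the README


def executed_chunk_indices(horizon: int, chunk_size: int = CHUNK_SIZE) -> range:
    """Indices of a chunk that actually run before the next re-query."""
    return range(min(horizon, chunk_size))


def queries_per_episode(horizon: int) -> int:
    """Number of policy queries in one full-length episode."""
    total_steps = STEP_HZ * EPISODE_LENGTH_S
    return math.ceil(total_steps / min(horizon, CHUNK_SIZE))


for horizon in (8, 16, 32):
    last = executed_chunk_indices(horizon)[-1]
    print(f"horizon={horizon}: runs chunk steps 0..{last}, "
          f"{queries_per_episode(horizon)} queries per episode")
```

Under these assumptions, horizon=32 executes chunk steps 0..31 and queries the policy 113 times in a 120 s episode, versus 450 queries at horizon=8.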
### Recommended settings

```bash
--policy_type=lerobot-act
--policy_action_horizon=32
--policy_checkpoint_path=<path-to-this-model>
--step_hz=30 # matches the dataset's 30 Hz
--episode_length_s=120
```

## Usage

### 1. Start the LeRobot async policy_server

```bash
pip install lerobot
python -m lerobot.async_inference.policy_server --host 0.0.0.0 --port 8080
```

### 2. Launch the LeIsaac eval from the client

Using our [vitorcen/LeIsaac](https://github.com/vitorcen/LeIsaac) fork:

```bash
cd LeIsaac
bash scripts/evaluation/run_eval.sh -- \
  --task=LeIsaac-SO101-PickOrange-v0 \
  --eval_rounds=3 \
  --episode_length_s=120 \
  --step_hz=30 \
  --policy_type=lerobot-act \
  --policy_host=127.0.0.1 --policy_port=8080 \
  --policy_checkpoint_path=wsagi/ACT-PickOrange \
  --policy_action_horizon=32 \
  --policy_language_instruction="Pick up the orange and place it on the plate" \
  --device=cuda --enable_cameras
```

`run_eval.sh` automatically computes a wall-clock timeout from a user-patience cap, so slow inference fails fast instead of being waited on pointlessly.

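If the eval is launched from a script rather than by hand, a small guard against the short-horizon trap may help. The sketch below only assembles the command shown above; `build_eval_command` and the `min_horizon` check are hypothetical additions, not part of LeIsaac or `run_eval.sh`.

```python
import shlex

# Flag values copied from the command above.
EVAL_FLAGS = {
    "task": "LeIsaac-SO101-PickOrange-v0",
    "eval_rounds": 3,
    "episode_length_s": 120,
    "step_hz": 30,
    "policy_type": "lerobot-act",
    "policy_host": "127.0.0.1",
    "policy_port": 8080,
    "policy_checkpoint_path": "wsagi/ACT-PickOrange",
    "policy_action_horizon": 32,
    "policy_language_instruction": "Pick up the orange and place it on the plate",
    "device": "cuda",
}


def build_eval_command(flags: dict, min_horizon: int = 32) -> list:
    """Return the argv list for run_eval.sh, refusing short horizons."""
    if flags["policy_action_horizon"] < min_horizon:
        raise ValueError(
            f"policy_action_horizon must be >= {min_horizon} for this checkpoint"
        )
    argv = ["bash", "scripts/evaluation/run_eval.sh", "--"]
    argv += [f"--{name}={value}" for name, value in flags.items()]
    argv.append("--enable_cameras")  # boolean flag, takes no value
    return argv


command = build_eval_command(EVAL_FLAGS)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, cwd="LeIsaac")`; building it as a list avoids shell-quoting problems with the language instruction.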
## Limitations

- **Dataset OOD on the 2nd-3rd orange**: the dataset has 60 episodes × 1 "place the N-th orange" demonstration per episode. State coverage for oranges 2 and 3 is about an order of magnitude sparser than for orange 1, so the task gets monotonically harder there and the motions become visibly more laboured. horizon=32 rescues the nominal success rate, but **precision still degrades linearly with the orange count**. This is a data issue, not a model issue.
- Three independent architectures (ACT, both our checkpoint and the public shadowHokage one; Diffusion Policy; SmolVLA) trained on the same dataset are **all OOD on the 3rd orange**: the whole family shares the problem.
- No image augmentation or domain randomization → real-world transfer is likely weak. This checkpoint is only validated in Isaac Sim simulation; real-robot deployment is not guaranteed.

## Related

- Same-task comparisons:
  - [`wsagi/DiffusionPolicy-PickOrange`](https://huggingface.co/wsagi/DiffusionPolicy-PickOrange): our own Diffusion Policy (267M, DDIM 32-step swap)
  - [`shadowHokage/act_policy`](https://huggingface.co/shadowHokage/act_policy): public checkpoint with the same recipe (our reproduction reference)
  - [`LightwheelAI/leisaac-pick-orange-v0`](https://huggingface.co/LightwheelAI/leisaac-pick-orange-v0): GR00T N1.5 SOTA (finishes all 3 oranges in 30 s)
- Full training + eval recipe: [vitorcen/LeIsaac](https://github.com/vitorcen/LeIsaac) fork

## Acknowledgments

- The LeIsaac team and LightwheelAI for the task environment and dataset
- The LeRobot team for the ACT implementation and the async inference framework
- shadowHokage for publishing the training recipe used as our reproduction baseline

## Citation

```bibtex
@inproceedings{zhao2023learning,
  title={Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware},
  author={Zhao, Tony Z. and Kumar, Vikash and Levine, Sergey and Finn, Chelsea},
  booktitle={Robotics: Science and Systems},
  year={2023}
}
```

## License

Apache-2.0
act-pick-orange.png
ADDED
config.json
ADDED
@@ -0,0 +1,71 @@
{
  "type": "act",
  "n_obs_steps": 1,
  "input_features": {
    "observation.state": {
      "type": "STATE",
      "shape": [
        6
      ]
    },
    "observation.images.front": {
      "type": "VISUAL",
      "shape": [
        3,
        480,
        640
      ]
    },
    "observation.images.wrist": {
      "type": "VISUAL",
      "shape": [
        3,
        480,
        640
      ]
    }
  },
  "output_features": {
    "action": {
      "type": "ACTION",
      "shape": [
        6
      ]
    }
  },
  "device": "cuda",
  "use_amp": false,
  "use_peft": false,
  "push_to_hub": false,
  "repo_id": null,
  "private": null,
  "tags": null,
  "license": null,
  "pretrained_path": null,
  "chunk_size": 100,
  "n_action_steps": 100,
  "normalization_mapping": {
    "VISUAL": "MEAN_STD",
    "STATE": "MEAN_STD",
    "ACTION": "MEAN_STD"
  },
  "vision_backbone": "resnet18",
  "pretrained_backbone_weights": "ResNet18_Weights.IMAGENET1K_V1",
  "replace_final_stride_with_dilation": false,
  "pre_norm": false,
  "dim_model": 512,
  "n_heads": 8,
  "dim_feedforward": 3200,
  "feedforward_activation": "relu",
  "n_encoder_layers": 4,
  "n_decoder_layers": 1,
  "use_vae": true,
  "latent_dim": 32,
  "n_vae_encoder_layers": 4,
  "temporal_ensemble_coeff": null,
  "dropout": 0.1,
  "kl_weight": 10.0,
  "optimizer_lr": 1e-05,
  "optimizer_weight_decay": 0.0001,
  "optimizer_lr_backbone": 1e-05
}
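A few of the fields above can be cross-checked mechanically. The sketch below is a hypothetical consistency check, not LeRobot's own validation: `check_act_config` is an illustrative name, and the rules it applies (the number of executed steps must fit inside a chunk, and both cameras share one image shape) are assumptions about what a sane config looks like for this checkpoint.

```python
import json

# Excerpt of config.json (values copied from the file above).
CONFIG = json.loads("""
{
  "chunk_size": 100,
  "n_action_steps": 100,
  "input_features": {
    "observation.state": {"type": "STATE", "shape": [6]},
    "observation.images.front": {"type": "VISUAL", "shape": [3, 480, 640]},
    "observation.images.wrist": {"type": "VISUAL", "shape": [3, 480, 640]}
  },
  "output_features": {"action": {"type": "ACTION", "shape": [6]}}
}
""")


def check_act_config(cfg: dict) -> list:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    # A policy cannot execute more steps than one chunk contains.
    if cfg["n_action_steps"] > cfg["chunk_size"]:
        problems.append("n_action_steps exceeds chunk_size")
    # Collect the distinct image shapes across all camera inputs.
    shapes = {
        tuple(feature["shape"])
        for feature in cfg["input_features"].values()
        if feature["type"] == "VISUAL"
    }
    if len(shapes) > 1:
        problems.append("cameras have different image shapes")
    return problems


print(check_act_config(CONFIG) or "config looks consistent")
```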
model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6f216a6f117b2b4a09af7706c10f48c4a7320d6d80172ce72b6f13d149032b2
size 206699736
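The pointer above also allows a quick sanity check on model size. This is a small sketch, assuming the weights are stored as 4-byte float32 values (an assumption: `use_amp` is false in `config.json`, but the stored dtype is not stated in this commit) and ignoring the safetensors header. `parse_lfs_pointer` is a hypothetical helper.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a {key: value} dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b6f216a6f117b2b4a09af7706c10f48c4a7320d6d80172ce72b6f13d149032b2
size 206699736
"""

pointer = parse_lfs_pointer(POINTER)
size_bytes = int(pointer["size"])

# Assumption: every parameter is a 4-byte float32. The safetensors header
# is ignored, so this slightly overestimates the parameter count.
approx_params = size_bytes / 4
print(f"{size_bytes / 1e6:.1f} MB -> about {approx_params / 1e6:.1f}M parameters")
```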
policy_postprocessor.json
ADDED
@@ -0,0 +1,32 @@
{
  "name": "policy_postprocessor",
  "steps": [
    {
      "registry_name": "unnormalizer_processor",
      "config": {
        "eps": 1e-08,
        "features": {
          "action": {
            "type": "ACTION",
            "shape": [
              6
            ]
          }
        },
        "norm_map": {
          "VISUAL": "MEAN_STD",
          "STATE": "MEAN_STD",
          "ACTION": "MEAN_STD"
        }
      },
      "state_file": "policy_postprocessor_step_0_unnormalizer_processor.safetensors"
    },
    {
      "registry_name": "device_processor",
      "config": {
        "device": "cpu",
        "float_dtype": null
      }
    }
  ]
}
policy_postprocessor_step_0_unnormalizer_processor.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5ac4af145fa293fb9282322bee7c87eb369ba8aca3e09dbf1db7600f46142fd5
size 7552
policy_preprocessor.json
ADDED
@@ -0,0 +1,64 @@
{
  "name": "policy_preprocessor",
  "steps": [
    {
      "registry_name": "rename_observations_processor",
      "config": {
        "rename_map": {}
      }
    },
    {
      "registry_name": "to_batch_processor",
      "config": {}
    },
    {
      "registry_name": "device_processor",
      "config": {
        "device": "cuda",
        "float_dtype": null
      }
    },
    {
      "registry_name": "normalizer_processor",
      "config": {
        "eps": 1e-08,
        "features": {
          "observation.state": {
            "type": "STATE",
            "shape": [
              6
            ]
          },
          "observation.images.front": {
            "type": "VISUAL",
            "shape": [
              3,
              480,
              640
            ]
          },
          "observation.images.wrist": {
            "type": "VISUAL",
            "shape": [
              3,
              480,
              640
            ]
          },
          "action": {
            "type": "ACTION",
            "shape": [
              6
            ]
          }
        },
        "norm_map": {
          "VISUAL": "MEAN_STD",
          "STATE": "MEAN_STD",
          "ACTION": "MEAN_STD"
        }
      },
      "state_file": "policy_preprocessor_step_3_normalizer_processor.safetensors"
    }
  ]
}
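For readers unfamiliar with `MEAN_STD` normalization, the sketch below shows the round trip that the pre- and post-processor configs describe: the normalizer standardizes inputs, and the unnormalizer maps policy outputs back to raw action units. The exact formula inside LeRobot is not shown in this commit; `(x - mean) / (std + eps)` with `eps=1e-08` is an assumption based on the common convention, and the statistics are made-up values for illustration.

```python
EPS = 1e-08  # "eps" value from the processor configs


def normalize(values, mean, std, eps=EPS):
    """Standardize each element: (x - mean) / (std + eps)."""
    return [(x - m) / (s + eps) for x, m, s in zip(values, mean, std)]


def unnormalize(values, mean, std, eps=EPS):
    """Invert normalize(): x * (std + eps) + mean."""
    return [x * (s + eps) + m for x, m, s in zip(values, mean, std)]


# Made-up statistics for a 6-dimensional action (illustration only).
mean = [0.0, 10.0, -5.0, 2.5, 0.1, 30.0]
std = [1.0, 2.0, 0.5, 4.0, 0.2, 10.0]
action = [0.5, 12.0, -5.5, 6.5, 0.3, 20.0]

normalized = normalize(action, mean, std)
restored = unnormalize(normalized, mean, std)
print([round(v, 3) for v in normalized])
```

The real statistics live in the two `*_normalizer_processor.safetensors` state files referenced by the configs.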
policy_preprocessor_step_3_normalizer_processor.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5ac4af145fa293fb9282322bee7c87eb369ba8aca3e09dbf1db7600f46142fd5
size 7552
train_config.json
ADDED
@@ -0,0 +1,206 @@
{
  "dataset": {
    "repo_id": "LightwheelAI/leisaac-pick-orange",
    "root": "/home/david/work/LeIsaac/datasets/raw/leisaac-pick-orange",
    "episodes": null,
    "image_transforms": {
      "enable": false,
      "max_num_transforms": 3,
      "random_order": false,
      "tfs": {
        "brightness": {
          "weight": 1.0,
          "type": "ColorJitter",
          "kwargs": {
            "brightness": [
              0.8,
              1.2
            ]
          }
        },
        "contrast": {
          "weight": 1.0,
          "type": "ColorJitter",
          "kwargs": {
            "contrast": [
              0.8,
              1.2
            ]
          }
        },
        "saturation": {
          "weight": 1.0,
          "type": "ColorJitter",
          "kwargs": {
            "saturation": [
              0.5,
              1.5
            ]
          }
        },
        "hue": {
          "weight": 1.0,
          "type": "ColorJitter",
          "kwargs": {
            "hue": [
              -0.05,
              0.05
            ]
          }
        },
        "sharpness": {
          "weight": 1.0,
          "type": "SharpnessJitter",
          "kwargs": {
            "sharpness": [
              0.5,
              1.5
            ]
          }
        },
        "affine": {
          "weight": 1.0,
          "type": "RandomAffine",
          "kwargs": {
            "degrees": [
              -5.0,
              5.0
            ],
            "translate": [
              0.05,
              0.05
            ]
          }
        }
      }
    },
    "revision": null,
    "use_imagenet_stats": true,
    "video_backend": "pyav",
    "return_uint8": false,
    "streaming": false
  },
  "env": null,
  "policy": {
    "type": "act",
    "n_obs_steps": 1,
    "input_features": {
      "observation.state": {
        "type": "STATE",
        "shape": [
          6
        ]
      },
      "observation.images.front": {
        "type": "VISUAL",
        "shape": [
          3,
          480,
          640
        ]
      },
      "observation.images.wrist": {
        "type": "VISUAL",
        "shape": [
          3,
          480,
          640
        ]
      }
    },
    "output_features": {
      "action": {
        "type": "ACTION",
        "shape": [
          6
        ]
      }
    },
    "device": "cuda",
    "use_amp": false,
    "use_peft": false,
    "push_to_hub": false,
    "repo_id": null,
    "private": null,
    "tags": null,
    "license": null,
    "pretrained_path": null,
    "chunk_size": 100,
    "n_action_steps": 100,
    "normalization_mapping": {
      "VISUAL": "MEAN_STD",
      "STATE": "MEAN_STD",
      "ACTION": "MEAN_STD"
    },
    "vision_backbone": "resnet18",
    "pretrained_backbone_weights": "ResNet18_Weights.IMAGENET1K_V1",
    "replace_final_stride_with_dilation": false,
    "pre_norm": false,
    "dim_model": 512,
    "n_heads": 8,
    "dim_feedforward": 3200,
    "feedforward_activation": "relu",
    "n_encoder_layers": 4,
    "n_decoder_layers": 1,
    "use_vae": true,
    "latent_dim": 32,
    "n_vae_encoder_layers": 4,
    "temporal_ensemble_coeff": null,
    "dropout": 0.1,
    "kl_weight": 10.0,
    "optimizer_lr": 1e-05,
    "optimizer_weight_decay": 0.0001,
    "optimizer_lr_backbone": 1e-05
  },
  "output_dir": "/home/david/work/LeIsaac/outputs/act-leisaac-pick-orange",
  "job_name": "act-leisaac-pick-orange",
  "resume": false,
  "seed": 1000,
  "cudnn_deterministic": false,
  "num_workers": 4,
  "batch_size": 8,
  "prefetch_factor": 4,
  "persistent_workers": true,
  "steps": 10000,
  "eval_freq": 20000,
  "log_freq": 200,
  "tolerance_s": 0.0001,
  "save_checkpoint": true,
  "save_freq": 2000,
  "use_policy_training_preset": true,
  "optimizer": {
    "type": "adamw",
    "lr": 1e-05,
    "weight_decay": 0.0001,
    "grad_clip_norm": 10.0,
    "betas": [
      0.9,
      0.999
    ],
    "eps": 1e-08
  },
  "scheduler": null,
  "eval": {
    "n_episodes": 50,
    "batch_size": 22,
    "use_async_envs": true
  },
  "wandb": {
    "enable": false,
    "disable_artifact": false,
    "project": "lerobot",
    "entity": null,
    "notes": null,
    "run_id": null,
    "mode": null,
    "add_tags": true
  },
  "peft": null,
  "use_rabc": false,
  "rabc_progress_path": null,
  "rabc_kappa": 0.01,
  "rabc_epsilon": 1e-06,
  "rabc_head_mode": "sparse",
  "rename_map": {},
  "checkpoint_path": null
}
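A few quantities implied by the schedule fields above can be derived with simple arithmetic. The sketch below is illustrative only: the values are copied from `train_config.json`, while the interpretation of `save_freq` and `eval_freq` as "every N steps" is an assumption about the trainer, not something this commit states.

```python
import json

# Excerpt of train_config.json (values copied from the file above).
TRAIN_CONFIG = json.loads("""
{
  "batch_size": 8,
  "steps": 10000,
  "save_freq": 2000,
  "eval_freq": 20000,
  "log_freq": 200
}
""")

steps = TRAIN_CONFIG["steps"]
samples_seen = steps * TRAIN_CONFIG["batch_size"]

# Assumption: a checkpoint is written every `save_freq` steps.
checkpoint_steps = list(
    range(TRAIN_CONFIG["save_freq"], steps + 1, TRAIN_CONFIG["save_freq"])
)

# Assumption: in-training eval fires every `eval_freq` steps, so it never
# triggers when eval_freq exceeds the total step count.
eval_steps = list(
    range(TRAIN_CONFIG["eval_freq"], steps + 1, TRAIN_CONFIG["eval_freq"])
)

print(f"samples seen: {samples_seen}")
print(f"checkpoints at: {checkpoint_steps}")
print(f"in-training evals: {len(eval_steps)}")
```

Under these assumptions the run processes 80,000 training samples and performs no in-training evaluation, which is consistent with `"env": null` and with success being measured separately in Isaac Sim.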