Update README: 3/3 success, project links, screenshot ref
README.md
CHANGED
@@ -3,138 +3,223 @@ license: apache-2.0
 library_name: lerobot
 pipeline_tag: robotics
 tags:
-- robotics
 - diffusion-policy
 - lerobot
-- so-101
 - so101
-- pick-orange
 - leisaac
 datasets:
 - LightwheelAI/leisaac-pick-orange
 ---
 
-# DiffusionPolicy
-[...]
 
-[...]
 ```
 
-[...]
 
-[...]
-patches are required for Diffusion Policy to work through that path (the server
-calls `predict_action_chunk()` instead of `select_action()`):
 
-[...]
-stub from the input batch, stacks raw image keys into `OBS_IMAGES`, and feeds
-the current observation into the policy's observation queue (auto-replicates
-to fill `n_obs_steps` on first call). Without this the server hits either
-`torch.stack([])` (empty deque) or `NoneType` (rollout `action=None` populated
-into the action deque).
-2. Server-side traceback logging in `Error in StreamActions` for debuggability.
 
-[...]
-`Error in StreamActions: stack expects a non-empty TensorList` and never
-produces an action.
 
-[...]
 
 ```bash
-[...]
 
-# 2
-[...]
 --task=LeIsaac-SO101-PickOrange-v0 \
---eval_rounds=
 --policy_type=lerobot-diffusion \
 --policy_host=127.0.0.1 --policy_port=8080 \
-[...]
 ```
 
-[...]
 
-## License
 
-[...]
-- Dataset: [`LightwheelAI/leisaac-pick-orange`](https://huggingface.co/datasets/LightwheelAI/leisaac-pick-orange)
-(Apache-2.0)
-- Policy class: LeRobot `--policy.type=diffusion`, Columbia Artificial
-Intelligence, Robotics Lab et al.
 library_name: lerobot
 pipeline_tag: robotics
 tags:
 - diffusion-policy
 - lerobot
 - so101
 - leisaac
+- pick-orange
+- isaac-sim
+- ddim
 datasets:
 - LightwheelAI/leisaac-pick-orange
+language:
+- en
 ---
 
+# DiffusionPolicy-PickOrange
+
+A LeRobot Diffusion Policy (~267M parameters, 1D UNet + ResNet18 vision encoder) **trained from scratch** on the [LeIsaac SO-101 PickOrange](https://github.com/LightwheelAI/leisaac) task. **DDIM 32-step inference has been hot-swapped in** by editing the checkpoint's `config.json`, with no retraining.
+
+**🔗 Project repos**:
+- [vitorcen/isaaclab-experience](https://github.com/vitorcen/isaaclab-experience): Isaac Lab + LeIsaac multi-policy comparison (parent project)
+- [vitorcen/LeIsaac](https://github.com/vitorcen/LeIsaac): LeIsaac fork (training scripts + design docs)
+
+## TL;DR
+
+- **Task**: `Pick up the orange and place it on the plate`. A single-arm SO-101 picks up 3 oranges one after another and places each on a plate.
+- **Dataset**: [`LightwheelAI/leisaac-pick-orange`](https://huggingface.co/datasets/LightwheelAI/leisaac-pick-orange), 60 teleoperated demonstration episodes.
+- **Architecture**: Diffusion Policy (1D UNet denoiser + ResNet18 dual-camera vision encoder + 6-DOF state input → 8-step action chunk).
+- **Training**: 100k steps, ~1.07 GB `model.safetensors`, trained with a 100-step DDPM scheduler.
+- **Inference hot-swap**: edit `config.json` to set `noise_scheduler_type: DDPM → DDIM` and `num_inference_steps: null → 32`, with **no retraining**. Inference latency drops from **393 ms to 147 ms per chunk** and the slowdown from 2.96x to **1.1x**, which is near real time on an RTX 4090.
+- **Eval**: Isaac Sim 5.1 + LeIsaac. Outcomes vary across rounds: the full range from 0/3 to 3/3 was observed, and **some rounds place all 3 oranges**. Diffusion sampling is inherently stochastic, so results are only meaningful when averaged over multiple rounds.
+
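The slowdown figures in the TL;DR can be roughly reproduced if "slowdown" is read as inference latency divided by the wall-clock time that one 8-step action chunk covers at 60 Hz. That reading is an assumption, since the README does not define the term, and the function name below is illustrative only.

```python
def slowdown(inference_ms: float, chunk_steps: int = 8, step_hz: float = 60.0) -> float:
    """Ratio of per-chunk inference latency to the time the chunk covers.

    Assumed definition (not stated in the README): a value of 1.0 means a
    new chunk is ready exactly when the previous one has been executed.
    """
    chunk_budget_ms = chunk_steps * 1000.0 / step_hz  # 8 steps at 60 Hz, about 133.3 ms
    return inference_ms / chunk_budget_ms


if __name__ == "__main__":
    # Latencies reported in the README for the two scheduler settings.
    for name, ms in [("DDPM 100-step", 393.0), ("DDIM 32-step", 147.0)]:
        print(f"{name}: {slowdown(ms):.2f}x")
```

Under this reading the two latencies come out near 2.95x and 1.10x, close to the reported 2.96x and 1.1x; the small gap suggests the author's exact budget differs slightly.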
+## Highlights
+
+- **DDIM scheduler hot-swap without retraining**: 100-step DDPM is the standard setting in the Diffusion Policy paper, but 100 sequential sampling steps cost 393 ms/chunk (2.96x slowdown), so real-time control is a struggle even on an RTX 4090. DDIM reuses the noise-prediction network trained under the DDPM objective and only changes the sampling procedure (deterministic once the initial noise is fixed), so **the scheduler can be swapped in the config without retraining the weights**. 32 steps is the sweet spot on the RTX 4090.
+- **Probabilistic full 3/3 success**: in multi-round eval, **some rounds pick up and place all 3 oranges**. The outcomes are noisier than ACT's deterministic 1/1, but they show that DP can reach full task completion at the edge of the dataset's coverage, rather than just barely grasping one orange.
+- **Trained from scratch, no pretrained vision backbone**: the ResNet18 vision encoder uses LeRobot diffusion's default from-scratch setting, with no ImageNet pretraining. Learning a visuomotor task from only 60 episodes is a stress test of that setup.
+
+## Training recipe
+
+| Item | Value |
+|---|---|
+| Dataset | `LightwheelAI/leisaac-pick-orange` (60 episodes, dual-camera 480×640 RGB + 6-DOF state, 30 Hz) |
+| Policy | `diffusion` (LeRobot implementation) |
+| Vision encoder | ResNet18 (from scratch, no ImageNet pretraining) |
+| Action head | 1D UNet denoiser |
+| `n_action_steps` (output chunk) | 8 |
+| Noise scheduler (training) | DDPM, 100 steps |
+| Noise scheduler (inference) | **DDIM, 32 steps** (hot-swapped post-training) |
+| Steps | 100,000 |
+| Optimizer | AdamW |
+| Hardware | RTX 4090 (24 GB) |
+| Recipe credit | LeRobot diffusion baseline, [Diffusion Policy paper (Chi et al. 2023)](https://diffusion-policy.cs.columbia.edu/) |
+
+Training entrypoint in our LeIsaac fork: [`scripts/training/diffusion_policy/train.sh`](https://github.com/vitorcen/LeIsaac/blob/main/scripts/training/diffusion_policy/train.sh).
+
+
## 评测结果
|
| 72 |
+
_Eval results_
|
| 73 |
+
|
| 74 |
+
测试环境 / Test setup:Isaac Sim 5.1,task `LeIsaac-SO101-PickOrange-v0`,`episode_length_s=120`,`step_hz=60`(DP 训练时 sim rate),dual-cam 观测,`policy_action_horizon=16`。
|
| 75 |
+
_Test setup: Isaac Sim 5.1, dual-cam observation, `step_hz=60` matching training, `policy_action_horizon=16`._
|
| 76 |
+
|
| 77 |
+
| 配置 / Config | 推理延迟 | 观察到的结果分布 | 备注 |
|
| 78 |
+
|---|---|---|---|
|
| 79 |
+
| DDPM 100-step (无 swap) | 393 ms/chunk, 2.96x slowdown | ⚠️ 多次 timeout | 实时性吃力,运动严重滞后 |
|
| 80 |
+
| **DDIM 32-step (本 ckpt 默认)** | **147 ms/chunk, 1.1x slowdown** | **0/3 / 1/3 / 2/3 / 3/3 全谱出现** | 部分轮能完整放完 3 颗 ✅ |
|
| 81 |
+
|
| 82 |
+
**关键观察 / Key observations**:
|
| 83 |
+
|
| 84 |
+
1. **Diffusion sampling 是 stochastic**:同 ckpt 同 config,每次推理从不同噪声起步 → 同 episode 跑多次结果不同。**这是架构特性,不是 bug**。
|
| 85 |
+
_Stochastic by design: same ckpt + config gives different outcomes per run due to noise initialization._
|
| 86 |
+
2. **部分轮 3/3 完整 success**:证明 DP 在 dataset 60-ep 边界内能 reach task completion,不只是单颗 grasp。
|
| 87 |
+
_Some rounds achieve full 3/3 — DP can reach task completion within the 60-episode dataset boundary._
|
| 88 |
+
3. **结果分布偏斜**:第 1 颗 success rate 远高于第 3 颗(共同 dataset OOD ceiling,与 ACT / SmolVLA / π0.5 一致)。
|
| 89 |
+
_Distribution is skewed: 1st-orange success rate >> 3rd-orange. Shared dataset OOD ceiling with ACT / SmolVLA / π0.5._
|
| 90 |
+
|
| 91 |
+
**严谨 success rate 估计 / Rigorous estimate**:需 `eval_rounds=10` 及以上多 round 平均才能定量。单 sample 误差大,**不要**用单 round 推论。
|
| 92 |
+
_Rigorous comparison requires `eval_rounds=10+`. Single-round inferences are misleading._
|
| 93 |
+
|
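Since the outcomes are only meaningful in aggregate, a small helper can turn per-round counts into the distribution discussed above. This is a minimal sketch: the function name and the input format (one integer per round, the number of oranges placed) are assumptions, and the sample list is invented for illustration rather than taken from any eval run.

```python
from collections import Counter
from typing import Dict, List


def summarize_rounds(placed_per_round: List[int], n_oranges: int = 3) -> Dict[str, object]:
    """Summarize multi-round eval outcomes.

    placed_per_round[i] is the number of oranges placed in round i (0..n_oranges).
    """
    if not placed_per_round:
        raise ValueError("need at least one round")
    if any(p < 0 or p > n_oranges for p in placed_per_round):
        raise ValueError("each entry must be between 0 and n_oranges")
    n = len(placed_per_round)
    counts = Counter(placed_per_round)
    return {
        "rounds": n,
        # Fraction of rounds ending with exactly k oranges placed.
        "distribution": {k: counts.get(k, 0) / n for k in range(n_oranges + 1)},
        # Fraction of rounds in which at least k oranges were placed.
        "at_least": {
            k: sum(p >= k for p in placed_per_round) / n
            for k in range(1, n_oranges + 1)
        },
        "mean_placed": sum(placed_per_round) / n,
        "full_success_rate": counts.get(n_oranges, 0) / n,
    }


if __name__ == "__main__":
    # Placeholder outcomes for 10 rounds, invented for illustration only.
    example = [0, 1, 1, 2, 3, 1, 2, 0, 3, 1]
    print(summarize_rounds(example))
```

The `at_least` entries make the skew visible directly: the gap between the 1st-orange and 3rd-orange rates is the quantity the third observation refers to.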
+## ⚠️ Critical inference settings
+
+### 1. DDIM hot-swap (already applied in this checkpoint)
+
+Key fields in `config.json` (already set in this repo):
+
+```json
+{
+  "noise_scheduler_type": "DDIM",
+  "num_inference_steps": 32
+}
+```
 
+`config.json.bak` keeps the original DDPM settings for comparison.
+
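For a locally trained checkpoint that still has the DDPM settings, the same edit can be scripted. The sketch below is one possible way to do it, assuming the checkpoint directory contains a plain JSON `config.json`; the function name and the backup behavior are illustrative choices, not the author's tooling.

```python
import json
import shutil
from pathlib import Path


def hot_swap_to_ddim(ckpt_dir: str, num_inference_steps: int = 32) -> dict:
    """Switch a checkpoint's config.json to DDIM sampling, keeping a backup."""
    config_path = Path(ckpt_dir) / "config.json"
    backup_path = config_path.with_name(config_path.name + ".bak")  # config.json.bak

    # Keep the original settings; never overwrite an existing backup.
    if not backup_path.exists():
        shutil.copy2(config_path, backup_path)

    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["noise_scheduler_type"] = "DDIM"
    config["num_inference_steps"] = num_inference_steps
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `hot_swap_to_ddim("path/to/checkpoint")` leaves every other field untouched, so the weights and the rest of the policy config stay exactly as trained.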
+### 2. Per-GPU DDIM step calibration
+
+Linear fit measured on RTX 4090 + Isaac Sim:
+
+```
+inference_ms ≈ 36 + n_steps × 3.3
+  # overhead 36 ms  = ResNet18 encode + ZMQ RTT
+  # per_step 3.3 ms = single UNet denoising step on 4090
+
+target_inference_ms = effective_chunk × (1000 / step_hz) × safety
+                    = 8 × 16.67 × 0.85 ≈ 113 ms   (60 Hz, safety 0.85)
+max_steps = (target - overhead) / per_step
+          = (113 - 36) / 3.3 ≈ 23   (safe)
+          = (133 - 36) / 3.3 ≈ 29   (critical, no safety margin)
 ```
 
+Measured on 4090: 30 steps → 2/3 oranges, **32 steps → full 3/3 success observed**, 50 steps → 3D rendering starved of GPU compute (OOM-like behavior).
 
+**Weaker GPU recommendation**: a 3060 takes ~10 ms/step, so the sweet spot is about **7-8 steps**. The full calibration is in the [design doc](https://github.com/vitorcen/LeIsaac/blob/main/docs/training/dp_inference_speedup_and_dynamic_timeout.html).
 
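The calibration arithmetic above can be wrapped in two small functions for trying other GPUs. This is a sketch of the README's formulas only: the function names are illustrative, the defaults are the 4090 fit constants, and reusing the 36 ms overhead for a different GPU is an assumption that should be re-measured.

```python
import math


def predicted_inference_ms(n_steps: int, overhead_ms: float = 36.0,
                           per_step_ms: float = 3.3) -> float:
    """Linear latency fit: fixed overhead plus a per-denoising-step cost."""
    return overhead_ms + n_steps * per_step_ms


def max_ddim_steps(per_step_ms: float = 3.3, overhead_ms: float = 36.0,
                   effective_chunk: int = 8, step_hz: float = 60.0,
                   safety: float = 0.85) -> int:
    """Largest step count whose predicted latency fits the chunk time budget."""
    target_ms = effective_chunk * (1000.0 / step_hz) * safety
    return max(0, math.floor((target_ms - overhead_ms) / per_step_ms))


if __name__ == "__main__":
    print("4090, safety 0.85:", max_ddim_steps())
    print("4090, safety 1.00:", max_ddim_steps(safety=1.0))
    # Assumes the same 36 ms overhead on the weaker card (not measured).
    print("~10 ms/step GPU:  ", max_ddim_steps(per_step_ms=10.0))
    print("32 steps on 4090: ", predicted_inference_ms(32), "ms")
```

The fit predicts about 142 ms for 32 steps, slightly under the measured 147 ms/chunk, and the measured 32-step setting sits just above the no-margin bound of 29, which is consistent with the reported 1.1x slowdown.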
+### 3. Action horizon setting
 
+The DP model outputs a fixed `n_action_steps=8`, so the server caps any client `policy_action_horizon` ≥ 8 at 8. Setting 16, 32, or 50 on the client is therefore equivalent.
 
+```bash
+--policy_action_horizon=16   # any value ≥ 8 works
+--step_hz=60                 # DP training sim rate
+--episode_length_s=120
+```
+
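The capping rule can be stated in one line. The sketch below only illustrates the behavior described above and is not the server's actual code; the function name is hypothetical, and passing values below 8 through unchanged is an assumption, since the README only covers the ≥ 8 case.

```python
def effective_chunk(policy_action_horizon: int, n_action_steps: int = 8) -> int:
    """Number of actions actually executed per inference call.

    The policy emits n_action_steps actions, so a larger client horizon
    cannot yield more than that. Horizons below n_action_steps are passed
    through unchanged here, which is an assumption.
    """
    if policy_action_horizon < 1:
        raise ValueError("policy_action_horizon must be at least 1")
    return min(policy_action_horizon, n_action_steps)


if __name__ == "__main__":
    for horizon in (16, 32, 50):
        print(horizon, "->", effective_chunk(horizon))
```

This effective chunk of 8, not the client-side horizon, is the value that enters the latency budget in the per-GPU calibration.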
+## Usage
+
+### 1. Start the LeRobot async policy_server
 
 ```bash
+pip install lerobot
+python -m lerobot.async_inference.policy_server --host 0.0.0.0 --port 8080
+```
 
+### 2. Launch eval through the [vitorcen/LeIsaac](https://github.com/vitorcen/LeIsaac) fork
+
+```bash
+cd LeIsaac
+bash scripts/evaluation/run_eval.sh -- \
 --task=LeIsaac-SO101-PickOrange-v0 \
+--eval_rounds=10 \
+--episode_length_s=120 \
+--step_hz=60 \
 --policy_type=lerobot-diffusion \
 --policy_host=127.0.0.1 --policy_port=8080 \
+--policy_checkpoint_path=wsagi/DiffusionPolicy-PickOrange \
+--policy_action_horizon=16 \
+--policy_language_instruction='Pick up the orange and place it on the plate' \
+--device=cuda --enable_cameras
 ```
 
+Use `eval_rounds=10` to average the success rate (DP is stochastic, so a single sample is easy to misjudge).
+
+## Limitations
+
+- **Stochastic success**: each diffusion sampling pass starts from different noise, so the same checkpoint and config still vary from run to run. Do **not** judge the model from a single round.
+- **2nd / 3rd orange is out of distribution**: a ceiling shared with ACT / SmolVLA / π0.5. The dataset has 60 episodes with one demonstration of "placing the Nth orange" per episode, so state coverage for the 2nd and 3rd oranges is sparse. Even with DDIM 32-step making real-time control feasible, **the success rate still falls with each successive orange**.
+- **GPU bound**: the DDIM step count is tightly coupled to GPU compute. The 32-step default is tuned for the RTX 4090; on a 3060/3070 it needs to drop to ~10 steps (lower sample quality, and possibly a lower success rate).
+- **No image augmentation, no domain randomization**: this is a sim-only checkpoint, so real-robot transfer is likely weak.
+
+## Related
+
+- Same-task comparisons:
+  - [`wsagi/ACT-PickOrange`](https://huggingface.co/wsagi/ACT-PickOrange): self-trained ACT (~80M), 1/1 deterministic success @ horizon=32
+  - [`shadowHokage/act_policy`](https://huggingface.co/shadowHokage/act_policy): community ACT, 1/1 (deterministic)
+  - [`LightwheelAI/leisaac-pick-orange-v0`](https://huggingface.co/LightwheelAI/leisaac-pick-orange-v0): GR00T N1.5 SOTA (~3B), places all 3 oranges in ~30 s
+- Full training + eval recipe: [vitorcen/LeIsaac](https://github.com/vitorcen/LeIsaac) fork
+- Design doc: [`docs/training/dp_inference_speedup_and_dynamic_timeout.html`](https://github.com/vitorcen/LeIsaac/blob/main/docs/training/dp_inference_speedup_and_dynamic_timeout.html), a complete postmortem of the DDIM swap + dynamic timeout (includes SVG fit curves)
+
+## Acknowledgments
+
+- The LeIsaac team and LightwheelAI for the task environment and dataset
+- The LeRobot team for the Diffusion Policy implementation and the async inference framework
+- Original Diffusion Policy paper: [Chi et al. 2023](https://diffusion-policy.cs.columbia.edu/)
+- DDIM scheduler swap inspired by the Hugging Face `diffusers` library
+
+## Citation
+
+```bibtex
+@inproceedings{chi2023diffusion,
+  title={Diffusion Policy: Visuomotor Policy Learning via Action Diffusion},
+  author={Chi, Cheng and Feng, Siyuan and Du, Yilun and Xu, Zhenjia and Cousineau, Eric and Burchfiel, Benjamin and Song, Shuran},
+  booktitle={Robotics: Science and Systems},
+  year={2023}
+}
+
+@inproceedings{song2021denoising,
+  title={Denoising Diffusion Implicit Models},
+  author={Song, Jiaming and Meng, Chenlin and Ermon, Stefano},
+  booktitle={International Conference on Learning Representations},
+  year={2021}
+}
+```
 
+## License
 
+Apache-2.0