FIPER RND-OE Detector: build_block_tower (RL Token, step 9999)
RND-OE (Random Network Distillation, Observation Embedding) novelty detector trained on RL-token embeddings from an OpenPI Pi0-RL policy fine-tuned on the build_block_tower task.
Model Details
- Method: RND-OE
- Embedding source: RL token from OpenPI Pi0-RL (pi0.5)
- Embedding dim: 2048
- RND output dim: 512
- OpenPI config: pi05_rl_token_build_block_tower
- OpenPI checkpoint: step 9999 (pi05_rl_token_build_block_tower/rlt_v1/9999)
- Dataset: villekuosmanen/build_block_tower (68,997 samples, 100 episodes)
- Action space: 7D joint-space
- Action horizon: 50
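For orientation, the generic RND novelty score can be written as below. This is the textbook formulation, assumed here rather than confirmed for this detector: the symbols $f_\theta$ (trained predictor), $g$ (fixed, randomly initialised target) and $s$ (score) are introduced for illustration, and the exact reduction over the 512 output dimensions (sum versus mean) is not stated in this card.

```latex
s(x) = \left\lVert f_\theta(x) - g(x) \right\rVert_2^2,
\qquad f_\theta,\ g : \mathbb{R}^{2048} \to \mathbb{R}^{512},
\qquad \theta^\ast = \arg\min_\theta \ \mathbb{E}_{x \sim \mathcal{D}_{\text{train}}}\big[ s(x) \big]
```

Under this formulation, embeddings similar to the training data yield a small $s(x)$, while unfamiliar embeddings yield a larger one.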
Training Config
| Parameter | Value |
|---|---|
| Batch size | 32 |
| Epochs | 10 |
| Optimizer | AdamW |
| Learning rate | 1e-4 → 1e-6 (cosine) |
| Weight decay | 1e-5 |
| Validation split | 90/10 (random, seed 42) |
| Calibration quantile | 0.95 |
| Threshold window | 3 |
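The cosine learning-rate decay in the table can be illustrated with a minimal sketch. It assumes the standard cosine-annealing closed form with no warmup; the function name `cosine_lr` and the per-step granularity are hypothetical and may differ from what the FIPER training code does.

```python
import math


def cosine_lr(step, total_steps, lr_max=1e-4, lr_min=1e-6):
    """Learning rate at `step`, annealed from lr_max to lr_min along a half cosine.

    Assumption: no warmup, and `step` counts optimizer updates from 0.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so steps outside the schedule stay at the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for step in (0, 250, 500, 750, 1000):
        print(step, f"{cosine_lr(step, total):.3e}")
```

The schedule starts at the maximum rate, passes through the midpoint of the two rates halfway through training, and ends at the minimum rate.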
Training Results
| Epoch | Train Loss | Val Loss |
|---|---|---|
| 1 | 0.000855 | 0.000398 |
| 2 | 0.000328 | 0.000278 |
| 3 | 0.000241 | 0.000215 |
| 4 | 0.000194 | 0.000187 |
| 5 | 0.000166 | 0.000160 |
| 6 | 0.000148 | 0.000145 |
| 7 | 0.000136 | 0.000133 |
| 8 | 0.000127 | 0.000127 |
| 9 | 0.000122 | 0.000123 |
| 10 | 0.000119 | 0.000121 |
Train loss fell from 0.000855 to 0.000119 and validation loss from 0.000398 to 0.000121 over the 10 epochs. Validation loss tracked train loss closely throughout, with no sign of overfitting.
Fitted threshold: 0.000577 (q=0.95, window=3, calibrated over 100 episodes / 62,097 samples)
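One plausible reading of the quantile and window settings is sketched below. It is an assumption, not the FIPER implementation: it smooths per-step RND scores with a trailing mean over the window, pools the smoothed scores across calibration episodes, and takes a nearest-rank empirical quantile. The names `windowed_scores`, `fit_threshold` and `flag_steps` are hypothetical, and the real code may aggregate differently (for example with a max, or per episode).

```python
import math
from statistics import mean


def windowed_scores(scores, window=3):
    """Trailing mean over the last `window` scores (fewer at the episode start)."""
    smoothed = []
    for t in range(len(scores)):
        start = max(0, t - window + 1)
        smoothed.append(mean(scores[start:t + 1]))
    return smoothed


def fit_threshold(episodes, quantile=0.95, window=3):
    """Nearest-rank quantile of windowed scores pooled over all calibration episodes."""
    pooled = sorted(s for ep in episodes for s in windowed_scores(ep, window))
    if not pooled:
        raise ValueError("no calibration scores provided")
    # Smallest pooled value with at least `quantile` of the data at or below it.
    rank = max(1, math.ceil(quantile * len(pooled)))
    return pooled[rank - 1]


def flag_steps(scores, threshold, window=3):
    """Mark each step whose windowed score exceeds the fitted threshold."""
    return [s > threshold for s in windowed_scores(scores, window)]
```

With this reading, a step is flagged as novel when its windowed score exceeds the fitted threshold (0.000577 for this detector), so a single noisy step is less likely to trigger an alarm than a sustained rise.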
W&B: training curves
Files
| File | Description | SHA256 |
|---|---|---|
| `best.pt` | Best model checkpoint (epoch 10) | 0f1073103cbefc7ea20ef9f2d58b27b224b7c7ef7555f283b8d4745de114386b |
| `latest.pt` | Latest model checkpoint (epoch 10) | 1b80f1c2eac203102b14bb05b31e039713d723148ccf6cdc59a8b8f7d9fcf6e5 |
| `rnd_oe_detector.pt` | Detector + threshold + calibration | 63e851d9bb680887cdbdfb5ebe1dd541ace9afd6af49a501f20c18ac56441391 |
| `config.json` | Resolved training config | cbcd7eaf0c611acb898810025e590b56f9d52bddb669e0e4d69b835b9565af8d |
Verify hashes: sha256sum <file>
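Where `sha256sum` is unavailable, the same check can be done with the Python standard library. The sketch below is illustrative: the helper names `sha256_of_file` and `verify` are hypothetical, and it assumes the files have been downloaded into a single local directory. The expected digests are copied from the table above.

```python
import hashlib
import os

# Expected digests, copied from the Files table.
EXPECTED = {
    "best.pt": "0f1073103cbefc7ea20ef9f2d58b27b224b7c7ef7555f283b8d4745de114386b",
    "latest.pt": "1b80f1c2eac203102b14bb05b31e039713d723148ccf6cdc59a8b8f7d9fcf6e5",
    "rnd_oe_detector.pt": "63e851d9bb680887cdbdfb5ebe1dd541ace9afd6af49a501f20c18ac56441391",
    "config.json": "cbcd7eaf0c611acb898810025e590b56f9d52bddb669e0e4d69b835b9565af8d",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file, read in chunks so large checkpoints fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED):
    """Map each expected file name to True if it exists and its digest matches."""
    results = {}
    for name, want in expected.items():
        path = os.path.join(directory, name)
        results[name] = os.path.isfile(path) and sha256_of_file(path) == want
    return results


if __name__ == "__main__":
    for name, ok in verify(".").items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```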
Usage
Evaluate on held-out episodes:
python -m python.fiper.evaluate_rnd_oe_batch \
--detector-path rnd_oe_detector.pt \
--dataset-repo-id villekuosmanen/build_block_tower \
--openpi-checkpoint-dir <path_to_openpi_checkpoint>/9999 \
--openpi-config-name pi05_rl_token_build_block_tower \
--embedding-variant rl_token \
--episodes-per-dataset 10 \
--output-dir eval_output/
Requires the OpenPI Pi0-RL checkpoint and the alpha-robotics repo with FIPER code.