Update README.md
README.md
CHANGED
@@ -6,11 +6,11 @@ pipeline_tag: reinforcement-learning
 library_name: stable-baselines3
 ---
 
-Reinforcement-Learning model that utilizes MaskablePPO to guide assembly of a jigsaw puzzle (puzzle pieces with irregular, convex boundaries).
+Reinforcement-Learning model *(puzzler-v0)* that utilizes MaskablePPO to guide assembly of a jigsaw puzzle (puzzle pieces with irregular, convex boundaries).
 
 Code to train/create this custom environment can be found in the "How Puzzling!" [github repo](https://github.com/reeeeemo/how-puzzling).
 
-Initialization Parameters:
+**Initialization Parameters:**
 ```
 def __init__(self, images, seg_model_path, max_steps=100, device="cpu")
 ```
@@ -25,4 +25,9 @@ def __init__(self, images, seg_model_path, max_steps=100, device="cpu")
 - Max number of steps allowed
 - Using MaskablePPO solves the issue of infinite actions, but if you decide to use PPO, ensure `max_steps` is set to the max number of puzzle pieces
 - *device*: **string**
-- CPU or GPU usage
+- CPU or GPU usage
+
+
+Training logs are stored in [events.out.tfevents](./events.out.tfevents.0.500k_5_images_norm_rew) and can be viewed by running `tensorboard --logdir .` after downloading the repo.
+
+The environment was trained with `VecNormalize` from the `stable_baselines3` library; the saved normalization statistics can be loaded from [vec_normalize.pkl](./vec_normalize.pkl).
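
The last added line points at `vec_normalize.pkl` without showing how it might be used. Below is a minimal sketch of one way to load it, assuming the policy was saved with `MaskablePPO` from `sb3_contrib` and that you have already built the custom environment from the "How Puzzling!" repo. The helper name `load_for_inference` and the `model_path` argument are hypothetical and do not come from the README.

```python
def load_for_inference(env, model_path, stats_path="vec_normalize.pkl"):
    """Wrap `env` with saved VecNormalize statistics and load the policy.

    `env` is an already-constructed instance of the custom puzzle environment.
    `model_path` is a hypothetical path to the saved MaskablePPO checkpoint.
    """
    # Imported inside the function so that defining this helper does not
    # require the RL libraries to be installed.
    from sb3_contrib import MaskablePPO
    from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

    # VecNormalize operates on vectorized environments, so wrap the single env.
    vec_env = DummyVecEnv([lambda: env])

    # Restore the running mean/variance collected during training.
    vec_env = VecNormalize.load(stats_path, vec_env)
    vec_env.training = False     # do not update the statistics at inference
    vec_env.norm_reward = False  # report raw, unnormalized rewards

    model = MaskablePPO.load(model_path, env=vec_env)
    return model, vec_env
```

Freezing the statistics with `training = False` keeps evaluation observations on the same scale the policy saw during training. When stepping the environment, `MaskablePPO.predict` also accepts an `action_masks` argument, which you would normally supply from the environment's mask.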