reeeemo committed on
Commit
1d53a18
·
verified ·
1 Parent(s): 47092e6

Update README.md

Files changed (1)
  1. README.md +8 -3
README.md CHANGED
@@ -6,11 +6,11 @@ pipeline_tag: reinforcement-learning
6
  library_name: stable-baselines3
7
  ---
8
 
9
- Reinforcement-Learning model that utilizes MaskablePPO to guide assembly of a jigsaw puzzle (puzzle pieces with irregular, convex boundaries).
10
 
11
  Code to train/create this custom environment can be found in the "How Puzzling!" [github repo](https://github.com/reeeeemo/how-puzzling).
12
 
13
- Initialization Parameters:
14
  ```
15
  def __init__(self, images, seg_model_path, max_steps=100, device="cpu")
16
  ```
@@ -25,4 +25,9 @@ def __init__(self, images, seg_model_path, max_steps=100, device="cpu")
25
  - Max number of steps allowed
26
  - Using MaskablePPO solves the issue of infinite actions, but if you decide to use PPO, ensure `max_steps` is set to the max number of puzzle pieces
27
  - *device*: **string**
28
- - CPU or GPU usage
 
 
 
 
 
 
6
  library_name: stable-baselines3
7
  ---
8
 
9
+ Reinforcement-Learning model *(puzzler-v0)* that utilizes MaskablePPO to guide assembly of a jigsaw puzzle (puzzle pieces with irregular, convex boundaries).
10
 
11
  Code to train/create this custom environment can be found in the "How Puzzling!" [github repo](https://github.com/reeeeemo/how-puzzling).
12
 
13
+ **Initialization Parameters:**
14
  ```
15
  def __init__(self, images, seg_model_path, max_steps=100, device="cpu")
16
  ```
 
25
  - Max number of steps allowed
26
  - Using MaskablePPO solves the issue of infinite actions, but if you decide to use PPO, ensure `max_steps` is set to the max number of puzzle pieces
27
  - *device*: **string**
28
+ - CPU or GPU usage
29
+
30
+
31
+ Training logs are stored in [events.out.tfevents](./events.out.tfevents.0.500k_5_images_norm_rew); view them with `tensorboard --logdir .` after downloading the repo.
32
+
33
+ The environment was trained with `VecNormalize` from the `stable_baselines3` library; you can load the saved wrapper from [vec_normalize.pkl](./vec_normalize.pkl).
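
A minimal sketch of how the saved `VecNormalize` wrapper might be restored for evaluation. It is not taken from the repo: `load_normalized_env` and the `make_env` factory are hypothetical names, and the caller is assumed to supply a zero-argument function that builds the puzzle environment with the `__init__` parameters listed above. Only `DummyVecEnv` and `VecNormalize.load` come from `stable_baselines3`.

```python
def load_normalized_env(make_env, stats_path="vec_normalize.pkl"):
    """Wrap a freshly built env with the saved VecNormalize wrapper.

    `make_env` is a hypothetical zero-argument factory that returns the
    puzzle environment; it is an assumption, not part of the repo's API.
    """
    # Imported lazily so this helper can be defined without the library installed.
    from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

    venv = DummyVecEnv([make_env])
    venv = VecNormalize.load(stats_path, venv)
    venv.training = False     # freeze the running statistics during evaluation
    venv.norm_reward = False  # report unnormalized rewards
    return venv
```

The returned environment can then be passed as `env=` to `MaskablePPO.load(...)` from `sb3_contrib`; the checkpoint filename is not stated in this diff, so it is left to the reader.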