KraTUZen commited on
Commit
f5cf129
·
verified ·
1 Parent(s): ad76789

Update README.md

Files changed (1)
  1. README.md +80 -26
README.md CHANGED
@@ -27,29 +27,83 @@ model-index:
  verified: false
  ---
 
- # **ppo** Agent playing **Pyramids**
- This is a trained model of a **ppo** agent playing **Pyramids**
- using the [Unity ML-Agents Library](https://github.com/Unity-Technologies/ml-agents).
-
- ## Usage (with ML-Agents)
- The Documentation: https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/
-
- We wrote a complete tutorial to learn to train your first agent using ML-Agents and publish it to the Hub:
- - A *short tutorial* where you teach Huggy the Dog 🐶 to fetch the stick and then play with him directly in your
- browser: https://huggingface.co/learn/deep-rl-course/unitbonus1/introduction
- - A *longer tutorial* to understand how ML-Agents works:
- https://huggingface.co/learn/deep-rl-course/unit5/introduction
-
- ### Resume the training
- ```bash
- mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
- ```
-
- ### Watch your Agent play
- You can watch your agent **playing directly in your browser**
-
- 1. If the environment is part of ML-Agents official environments, go to https://huggingface.co/unity
- 2. Step 1: Find your model_id: KraTUZen/ppo-PyramidsTraining
- 3. Step 2: Select your *.nn /*.onnx file
- 4. Click on Watch the agent play 👀
-
  verified: false
  ---
 
+
+ # 🏛️ **PPO Agent on Pyramids**
+
+ This repository contains a trained **Proximal Policy Optimization (PPO)** agent that plays the **Pyramids** environment using the [Unity ML-Agents Library](https://github.com/Unity-Technologies/ml-agents).
+
+ ---
+
+ ## 📊 Model Card
+
+ **Model Name:** `ppo-PyramidsTraining`
+ **Environment:** `Pyramids` (Unity ML-Agents)
+ **Algorithm:** PPO (Proximal Policy Optimization)
+ **Performance:**
+ - Reaches stable performance on the Pyramids task
+ - Converges to an effective policy
+
+ ---
+
+ ## 🚀 Usage (with ML-Agents)
+
+ Documentation: [ML-Agents Toolkit Docs](https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/)
+
+ ### Resume Training
+ ```bash
+ mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
+ ```
+
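The file passed to `mlagents-learn` is a YAML trainer configuration. A minimal sketch for Pyramids is shown below; the hyperparameter values are illustrative assumptions, not the settings this model was trained with:

```yaml
behaviors:
  Pyramids:
    trainer_type: ppo
    hyperparameters:
      batch_size: 128
      buffer_size: 2048
      learning_rate: 3.0e-4
      epsilon: 0.2        # PPO clip ratio
    network_settings:
      hidden_units: 512
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 1000000
```

The behavior name (`Pyramids`) must match the behavior defined in the Unity environment for training to resume correctly.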
+ ### Load and Run
+ ```python
+ # Example: downloading the trained PPO policy from the Hub
+ # (assumes the huggingface_hub package; playing the policy back requires a Unity ML-Agents setup)
+ from huggingface_hub import hf_hub_download
+
+ model_path = hf_hub_download(
+     repo_id="KraTUZen/ppo-PyramidsTraining",
+     filename="Pyramids.onnx",  # or the .nn file, depending on what the repo contains
+ )
+ ```
+
+ ---
+
+ ## 🧠 Notes
+ - The agent is trained with **PPO**, a robust on-policy algorithm and the default trainer in Unity ML-Agents.
+ - In Pyramids, the agent must press a switch, knock over the pyramid it spawns, and retrieve the golden brick on top.
+ - The trained policy is stored as a `.nn` or `.onnx` file for direct Unity integration.
+
+ ---
+
+ ## 📂 Repository Structure
+ - `Pyramids.nn` / `Pyramids.onnx` → Trained PPO policy
+ - `README.md` → Documentation and usage guide
+
+ ---
+
+ ## ✅ Results
+ - The agent learns to solve the Pyramids task reliably.
+ - Training is stable, and the PPO policy converges to effective behavior.
+
+ ---
+
+ ## 🔎 Environment Overview
+ - **Observation Space:** Vector observations, including ray-casts that detect the switch, bricks, the golden brick, and walls
+ - **Action Space:** Discrete (move forward/backward, rotate left/right)
+ - **Objective:** Press the switch, topple the spawned pyramid, and reach the golden brick
+ - **Reward:** A positive reward for reaching the golden brick, plus a small per-step penalty to encourage efficiency
+
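The per-step rewards above are combined into a discounted return, which the PPO objective maximizes. A minimal illustration (generic RL arithmetic; the reward values in the example are hypothetical, not Pyramids' actual ones):

```python
def discounted_return(rewards, gamma=0.99):
    """Compute G = r_0 + gamma * r_1 + gamma^2 * r_2 + ... by folding backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# e.g. two small step penalties followed by a success bonus:
episode = [-0.001, -0.001, 2.0]
print(discounted_return(episode))
```

The per-step penalty makes longer episodes strictly less valuable, which is what pushes the agent toward efficient solutions.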
+ ---
+
+ ## 📚 Learning Highlights
+ - **Algorithm:** PPO (Proximal Policy Optimization)
+ - **Update Rule:** Clipped surrogate objective to ensure stable updates
+ - **Strengths:** Robust, stable, widely used in Unity ML-Agents
+ - **Limitations:** Requires careful tuning of hyperparameters (clip ratio, learning rate, batch size)
+
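The clipped surrogate objective can be sketched for a single sample as follows (a minimal illustration of the update rule, not the ML-Agents implementation):

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """min(r * A, clip(r, 1 - eps, 1 + eps) * A): the policy gains nothing
    from pushing the probability ratio r outside [1 - eps, 1 + eps]."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)

# With a positive advantage, improvement is capped once the ratio exceeds 1 + eps:
print(ppo_clip_objective(1.5, 1.0))  # capped at 1.2 rather than 1.5
```

The `eps` parameter here is the clip ratio listed under Limitations: smaller values give more conservative, more stable updates.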
+ ---
+
+ ## 🎮 Watch Your Agent Play
+ You can watch your agent **directly in your browser**:
+
+ 1. Visit [Unity ML-Agents on Hugging Face](https://huggingface.co/unity)
+ 2. Find your model ID: `KraTUZen/ppo-PyramidsTraining`
+ 3. Select your `.nn` or `.onnx` file
+ 4. Click **Watch the agent play 👀**
+