KraTUZen/ppo-PyramidsTraining

@@ -27,29 +27,83 @@ model-index:
       verified: false
 ---
-  # **ppo** Agent playing **Pyramids**
-  This is a trained model of a **ppo** agent playing **Pyramids**
-  using the [Unity ML-Agents Library](https://github.com/Unity-Technologies/ml-agents).
-  ## Usage (with ML-Agents)
-  The Documentation: https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/
-  We wrote a complete tutorial to learn to train your first agent using ML-Agents and publish it to the Hub:
-  - A *short tutorial* where you teach Huggy the Dog 🐶 to fetch the stick and then play with him directly in your
-  browser: https://huggingface.co/learn/deep-rl-course/unitbonus1/introduction
-  - A *longer tutorial* to understand how ML-Agents works:
-  https://huggingface.co/learn/deep-rl-course/unit5/introduction
-  ### Resume the training
-  ```bash
-  mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
-  ```
-  ### Watch your Agent play
-  You can watch your agent **playing directly in your browser**:
-  1. If the environment is part of ML-Agents official environments, go to https://huggingface.co/unity
-  2. Find your model_id: KraTUZen/ppo-PyramidsTraining
-  3. Select your *.nn / *.onnx file
-  4. Click on Watch the agent play 👀

+# 🏛️ **PPO Agent on Pyramids**
+This repository contains a trained **Proximal Policy Optimization (PPO)** agent that plays the **Pyramids** environment using the [Unity ML-Agents Library](https://github.com/Unity-Technologies/ml-agents).
+---
+## 📊 Model Card
+- **Model Name:** `ppo-PyramidsTraining`
+- **Environment:** `Pyramids` (Unity ML-Agents)
+- **Algorithm:** PPO (Proximal Policy Optimization)
+- **Performance:** described qualitatively here; the `model-index` entry in the front matter is marked `verified: false`
+  - Achieves stable performance in navigating and solving pyramid-based tasks
+  - Demonstrates convergence to an effective policy
+---
+## 🚀 Usage (with ML-Agents)
+Documentation: [ML-Agents Toolkit Docs](https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/)
+### Resume Training
+```bash
+mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
+```
+### Load and Run
+```python
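+# Example: downloading the trained PPO model files from the Hub
+# (running the policy itself requires a Unity ML-Agents setup)
+from huggingface_hub import snapshot_download
+
+model_id = "KraTUZen/ppo-PyramidsTraining"
+local_dir = snapshot_download(repo_id=model_id)  # local path to the downloaded repo files
+# Select your .nn or .onnx file from local_dir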
+```
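The "select your `.nn` or `.onnx` file" step can be sketched as a small helper that scans a local copy of the repository. This is only an illustration: `find_policy_files` is a hypothetical name, not part of ML-Agents or the Hugging Face libraries, and the directory layout is an assumption.

```python
from pathlib import Path

# Hypothetical helper: file suffixes are taken from this README,
# the directory layout is assumed.
POLICY_SUFFIXES = {".onnx", ".nn"}


def find_policy_files(repo_dir):
    """Return all .onnx and .nn files found anywhere under repo_dir."""
    root = Path(repo_dir)
    return [
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix in POLICY_SUFFIXES
    ]
```

Calling `find_policy_files(local_dir)` on a downloaded copy of the repo would return the candidate policy files to load in Unity.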
+---
+## 🧠 Notes
+- The agent is trained using **PPO**, a robust on-policy algorithm widely used in Unity ML-Agents.
+- In the **Pyramids** environment the agent must press a switch to spawn a pyramid, knock the pyramid over, and reach the golden brick on top, which requires exploration as well as precise navigation.
+- The trained model is stored as `.nn` or `.onnx` files for direct Unity integration.
+---
+## 📂 Repository Structure
+- `Pyramids.nn` / `Pyramids.onnx` → Trained PPO policy
+- `README.md` → Documentation and usage guide
+---
+## ✅ Results
+- The agent learns to navigate pyramid structures and solve tasks efficiently.
+- Demonstrates stable training and effective policy convergence using PPO.
+---
+## 🔎 Environment Overview
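+- **Observation Space:** Continuous vector of 148 values from local ray-casts (detecting the switch, bricks, golden brick, and walls) plus the switch state, per the ML-Agents example-environment documentation
+- **Action Space:** Discrete, one branch with 4 actions (rotation and forward/backward movement)
+- **Objective:** Press the switch to spawn a pyramid, knock it over, and reach the golden brick on top
+- **Reward:** +2 for reaching the golden brick, with a small per-step penalty (-0.001)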
+---
+## 📚 Learning Highlights
+- **Algorithm:** PPO (Proximal Policy Optimization)
+- **Update Rule:** Clipped surrogate objective to ensure stable updates
+- **Strengths:** Robust, stable, widely used in Unity ML-Agents
+- **Limitations:** Requires careful tuning of hyperparameters (clip ratio, learning rate, batch size)
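The clipped surrogate objective listed under "Update Rule" can be sketched in plain Python as follows. This is a minimal illustration of the standard PPO-clip formula, not the ML-Agents implementation. The function name `clipped_surrogate` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    clip_eps is the clip ratio (0.2 is an assumed, commonly used value).
    """
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nlp - olp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes any incentive to push the ratio
        # outside [1 - clip_eps, 1 + clip_eps].
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, a ratio above `1 + clip_eps` earns no extra objective value. With a negative advantage, a ratio below `1 - clip_eps` is likewise capped, which is what keeps each policy update small.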
+---
+## 🎮 Watch Your Agent Play
+You can watch your agent **directly in your browser**:
+1. Visit [Unity ML-Agents on Hugging Face](https://huggingface.co/unity)
+2. Find your model ID: `KraTUZen/ppo-PyramidsTraining`
+3. Select your `.nn` or `.onnx` file
+4. Click **Watch the agent play 👀**