mischievers/openfront-rl-multiagent

+---
+license: mit
+tags:
+  - reinforcement-learning
+  - ppo
+  - openfront
+  - game-ai
+---
+# OpenFront RL Agent
+PPO-trained agent for [OpenFront.io](https://openfront.io), a multiplayer territory control game.
+## Training Details
+- **Algorithm:** PPO (Proximal Policy Optimization)
+- **Architecture:** Actor-Critic with shared backbone (512→512→256)
+- **Observation dim:** 96
+- **Max neighbors:** 16
+- **Maps:** plains, big_plains, ocean_and_land, half_land_half_ocean (one chosen at random per episode)
+- **Opponents:** Easy bots (number not recorded)
+- **Parallel envs:** 16
+- **Learning rate:** 0.00034
+- **Rollout steps:** 1024
+- **Updates trained:** 660
+- **Global steps:** 86,507,520
+- **Best mean reward:** -0.0628
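The step counts above can be cross-checked with a little arithmetic. The sketch below uses only the numbers reported in this card; the variable names are ours, and the card does not state where the resulting factor comes from (several agents per environment or several worker processes are possibilities, not confirmed facts).

```python
# Figures copied from the Training Details list
updates = 660
global_steps = 86_507_520
parallel_envs = 16
rollout_steps = 1024

# Environment steps consumed by each PPO update
steps_per_update = global_steps // updates

# Transitions that 16 single-agent envs would produce in one 1024-step rollout
single_agent_batch = parallel_envs * rollout_steps

# Ratio between the two; its origin is not documented in this card
extra_factor = steps_per_update // single_agent_batch

print(steps_per_update, single_agent_batch, extra_factor)  # 131072 16384 8
```

Each update therefore covers 131,072 steps, eight times what 16 environments with 1,024 rollout steps would yield on their own.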
+## Final Training Metrics
+- **Mean reward:** -0.5555
+- **Mean episode length:** 7626.04
+- **Loss:** -0.1637
+## Usage
+```python
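+# `train` is assumed to be the training script from the josh-freeman/openfront-rl repository
+from train import ActorCritic
+import torch
+
+# Constructor arguments must match the training configuration listed above
+model = ActorCritic(obs_dim=96, max_neighbors=16, hidden_sizes=[512, 512, 256])
+# weights_only=True restricts unpickling to tensors and other safe types
+model.load_state_dict(torch.load("best_model.pt", map_location="cpu", weights_only=True))
+model.eval()  # switch to inference mode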
+```
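To give a rough sense of model size, the sketch below counts the weights in the shared backbone implied by `obs_dim=96` and `hidden_sizes=[512, 512, 256]`. It assumes plain fully connected layers with biases, fed directly by the observation vector, which this card does not confirm. The policy and value heads are left out because their sizes are not given. `dense_param_count` is a hypothetical helper, not part of the repository.

```python
def dense_param_count(in_dim, hidden_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    prev = in_dim
    for width in hidden_sizes:
        total += prev * width + width  # weight matrix plus bias vector
        prev = width
    return total


# Assumed layout: 96 -> 512 -> 512 -> 256, heads excluded
backbone_params = dense_param_count(96, [512, 512, 256])
print(backbone_params)  # 443648
```

At 4 bytes per float32 parameter, that comes to roughly 1.8 MB for the backbone under these assumptions.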
+## Repository
+Trained with the code in [josh-freeman/openfront-rl](https://github.com/josh-freeman/openfront-rl).