PPO Agent Playing ViZDoom Health Gathering Supreme
This model was trained with Sample Factory's high-performance PPO implementation to play ViZDoom's Health Gathering Supreme scenario from pixel observations.
The agent learns to survive in a 3D environment by collecting health packs while avoiding acidic floor damage, using only visual input and basic movement controls.
Algorithm: PPO (Sample Factory implementation)
Environment: ViZDoom Health Gathering Supreme
Training: 4,000,000 steps
Observation: RGB pixels (84x84)
Actions: Turn left, turn right, move forward
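As an illustration of the action set listed above, the three controls can be sketched as a small discrete action space. This is a minimal sketch, not Sample Factory's actual action definition: the `Action` enum, the index ordering, the helper `to_button_vector`, and the assumption that exactly one button is pressed per step are all hypothetical.

```python
from enum import IntEnum
from typing import List


class Action(IntEnum):
    """Hypothetical indices for the three controls; the real ordering may differ."""
    TURN_LEFT = 0
    TURN_RIGHT = 1
    MOVE_FORWARD = 2


def to_button_vector(action: Action) -> List[int]:
    """Encode one discrete action as a one-hot list of button states (1 = pressed)."""
    buttons = [0] * len(Action)
    buttons[action] = 1
    return buttons


if __name__ == "__main__":
    for action in Action:
        print(action.name, to_button_vector(action))
```

A one-hot encoding keeps the policy output a single categorical choice per step. An environment that allows pressing several buttons at once would need a different encoding.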
The agent is given no explicit knowledge of what prolongs its life and must discover on its own that collecting medkits is essential for survival.
Performance: Mean reward 8.45 ± 2.1
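A figure of the form "mean ± spread" is typically obtained by averaging per-episode returns over a set of evaluation episodes. The sketch below shows one way to compute it. The function name `summarize_returns` and the example values are hypothetical placeholders, not this model's evaluation data, and it is an assumption that the spread is a population standard deviation; the card does not state how it was computed or how many episodes were used.

```python
import statistics
from typing import Sequence, Tuple


def summarize_returns(returns: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of per-episode returns."""
    if not returns:
        raise ValueError("need at least one episode return")
    mean = statistics.fmean(returns)
    std = statistics.pstdev(returns)  # assumption: population std, not sample std
    return mean, std


if __name__ == "__main__":
    # Placeholder returns for illustration only; not real evaluation results.
    example_returns = [6.0, 8.0, 10.0]
    mean, std = summarize_returns(example_returns)
    print(f"mean_reward: {mean:.2f} +/- {std:.2f}")
```

With only a handful of episodes the spread estimate is noisy, so comparisons between runs are more meaningful when the episode count is reported alongside the mean.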
Evaluation results
- mean_reward on doom_health_gathering_supreme (self-reported): 8.45 +/- 2.1