PPO Agent Playing ViZDoom Health Gathering Supreme

This model was trained with Sample Factory's high-performance PPO implementation to play ViZDoom's Health Gathering Supreme scenario from pixel observations.

The agent learns to survive in a 3D environment by collecting health packs (medkits) to offset damage from the acidic floor, using only visual input and basic movement controls.

Algorithm: PPO (Sample Factory implementation)
Environment: ViZDoom Health Gathering Supreme
Training: 4,000,000 steps
Observation: RGB pixels (84x84)
Actions: Turn left, turn right, move forward
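
The observation and action entries above can be made concrete with a small sketch. It assumes the three actions form a discrete set in which each index maps to a one-hot list of button states, in the order listed above. The names BUTTONS, OBS_SHAPE, action_to_buttons, and obs_size are hypothetical and are not taken from Sample Factory or ViZDoom.

```python
# Hypothetical sketch: names and button ordering are illustrative assumptions,
# not Sample Factory's or ViZDoom's actual definitions.
from typing import List, Tuple

# Assumed order, following the "Actions" line above.
BUTTONS: Tuple[str, ...] = ("TURN_LEFT", "TURN_RIGHT", "MOVE_FORWARD")

# Observation shape from the summary: 84x84 RGB (height, width, channels).
OBS_SHAPE: Tuple[int, int, int] = (84, 84, 3)


def action_to_buttons(action_index: int) -> List[int]:
    """Convert a discrete action index into a one-hot list of button states."""
    if not 0 <= action_index < len(BUTTONS):
        raise ValueError(f"action index {action_index} out of range")
    return [1 if i == action_index else 0 for i in range(len(BUTTONS))]


def obs_size(shape: Tuple[int, int, int] = OBS_SHAPE) -> int:
    """Number of scalar values in a single observation frame."""
    height, width, channels = shape
    return height * width * channels
```

Under these assumptions, action_to_buttons(2) selects "move forward", and one frame holds 84 * 84 * 3 values that the policy network has to process at every step.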

The agent is never told what keeps it alive; it must discover from experience that collecting medkits is essential for survival.

Performance: Mean reward 8.45 ± 2.1
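
A figure of the form "mean ± spread" is typically obtained by averaging per-episode rewards over a set of evaluation episodes. The sketch below shows one way to compute it. The function name summarize_rewards and the example rewards are made up for illustration, and the use of the population standard deviation is an assumption, since the card does not say which estimator produced the reported value.

```python
# Hypothetical sketch: the rewards below are invented for illustration and
# are not this model's evaluation data.
import statistics
from typing import Sequence


def summarize_rewards(episode_rewards: Sequence[float]) -> str:
    """Format per-episode rewards as 'mean ± std'.

    Uses the population standard deviation (an assumption; a sample
    standard deviation would give a slightly larger spread).
    """
    mean = statistics.fmean(episode_rewards)
    std = statistics.pstdev(episode_rewards)
    return f"{mean:.2f} ± {std:.1f}"


example_rewards = [6.0, 8.0, 10.0, 12.0]
summary = summarize_rewards(example_rewards)
```

With only a handful of episodes the spread estimate is noisy, so evaluation runs usually average over many episodes before reporting a number like the one above.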

Evaluation results: mean_reward on doom_health_gathering_supreme (self-reported): 8.45 ± 2.1