Reinforce++ Agent playing CartPole-v1

This model is trained with REINFORCE using a learned baseline (a value network), entropy regularization, batched updates, observation normalization, orthogonal weight initialization, and gradient clipping.
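As a rough illustration of how several of these pieces fit together, the sketch below computes discounted returns, a baseline-corrected policy loss with an entropy bonus, and global-norm gradient clipping in plain Python. It is not the training code for this model: the function names, the coefficients (`gamma`, `entropy_coef`, `value_coef`), and the clipping threshold are all assumptions chosen for the example, and observation normalization and orthogonal initialization are left out.

```python
import math

def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * G_{t+1} for one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

def categorical_entropy(probs):
    """Entropy of a discrete action distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def reinforce_loss(log_probs, action_probs, returns, values,
                   entropy_coef=0.01, value_coef=0.5):
    """Scalar loss for a batch of steps (coefficients are assumed values).

    log_probs:    log pi(a_t | s_t) of the actions actually taken
    action_probs: full action distribution at each step
    returns:      discounted returns G_t
    values:       baseline predictions V(s_t)
    """
    n = len(returns)
    # Advantage = return minus baseline; the baseline is treated as a
    # constant in the policy term and fitted by the separate value term.
    advantages = [g - v for g, v in zip(returns, values)]
    policy_loss = -sum(lp * a for lp, a in zip(log_probs, advantages)) / n
    value_loss = sum((g - v) ** 2 for g, v in zip(returns, values)) / n
    entropy = sum(categorical_entropy(p) for p in action_probs) / n
    # Subtracting the entropy term rewards more exploratory policies.
    return policy_loss + value_coef * value_loss - entropy_coef * entropy

def clip_by_global_norm(grads, max_norm=0.5):
    """Rescale a flat list of gradients so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (norm + 1e-6))
    return [g * scale for g in grads]
```

In an actual training loop these quantities would normally be tensors in an autodiff framework, with the loss averaged over a batch of episodes before each optimizer step.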

Evaluation results