Deep Q-Network (DQN) β€” CartPole-v1

This repository contains a trained Deep Q-Network (DQN) agent for the Gymnasium environment CartPole-v1.


Environment

  • Environment: CartPole-v1
  • State Space: 4-dimensional continuous vector
  • Action Space: 2 discrete actions
  • Goal: Balance the pole for as long as possible

CartPole has a continuous state space, making tabular Q-learning infeasible.


Algorithm β€” Deep Q-Network (DQN)

DQN approximates Q-values using a neural network:

Q(s,a; ΞΈ)

Training target:

y = r + γ max_a' Q(s',a'; θ⁻)   (y = r for terminal transitions)

Key components:

  • Policy Network
  • Target Network
  • Experience Replay Buffer
  • MSE loss optimization
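The components above can be sketched in PyTorch as follows. This is a minimal illustration, not the exact architecture of the checkpoint in this repository; the hidden size and variable names are assumptions.

```python
import torch
import torch.nn as nn

# Minimal Q-network sketch, assuming a 4-dim state and 2 actions as in CartPole-v1.
class QNet(nn.Module):
    def __init__(self, state_dim=4, n_actions=2, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return self.net(x)

policy_net = QNet()
target_net = QNet()
target_net.load_state_dict(policy_net.state_dict())  # sync target with policy

# TD target: y = r + gamma * max_a' Q(s', a'; theta^-), masked on episode end
gamma = 0.99
s_next = torch.randn(64, 4)   # batch of next states (dummy data)
r = torch.randn(64)           # rewards
done = torch.zeros(64)        # 1.0 where the episode terminated

with torch.no_grad():
    y = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
```

The target network θ⁻ is only refreshed periodically (here via `load_state_dict`), which keeps the regression target stable between updates.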

Why DQN?

Because CartPole has continuous states:

s ∈ ℝ⁴

A Q-table cannot enumerate the infinitely many possible states, so a neural network is used for function approximation.


Training Details

  • Learning rate: 1e-3
  • Discount factor: 0.99
  • Batch size: 64
  • Target update frequency: 100 steps
  • Episodes: 500

The replay buffer improves stability and data efficiency by decorrelating consecutive transitions and reusing past experience.
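A minimal replay buffer sketch, using the batch size listed above (the capacity and class name are illustrative assumptions):

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10_000):
        # deque drops the oldest transitions once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Uniform random sampling breaks temporal correlation
        batch = random.sample(self.buffer, batch_size)
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer()
for _ in range(200):
    buf.push([0.0] * 4, 0, 1.0, [0.0] * 4, False)

states, actions, rewards, next_states, dones = buf.sample(64)
```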


Performance

The agent learns to consistently balance the pole for long durations (CartPole-v1 caps episodes at 500 steps).

Training reward improves steadily over episodes.


Visualization

The trained DQN agent is shown in cartpole_dqn.gif.


Files

  • cartpole_dqn.pt β†’ Trained PyTorch model
  • cartpole_dqn.gif β†’ Agent demonstration
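A hedged sketch of saving and restoring a checkpoint like cartpole_dqn.pt. It assumes the file stores a `state_dict` for the same architecture used at training time; the layer sizes and the demo path below are assumptions, and a save/load round-trip on a dummy network is used so the snippet is self-contained.

```python
import torch
import torch.nn as nn

# Assumed architecture — must match whatever was used at training time.
net = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 2))
torch.save(net.state_dict(), "cartpole_dqn_demo.pt")  # demo path, not the repo file

restored = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 2))
restored.load_state_dict(torch.load("cartpole_dqn_demo.pt"))
restored.eval()

# Greedy action selection from the restored network
state = torch.zeros(1, 4)
action = restored(state).argmax(dim=1).item()
```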

Summary

This project demonstrates:

  • Function approximation in reinforcement learning
  • Experience replay
  • Target networks for stability
  • Deep reinforcement learning fundamentals

It represents a transition from tabular RL to deep RL.
