# Q-Learning Agent — FrozenLake-v1
This repository contains a trained Q-Learning agent for the Gymnasium environment FrozenLake-v1.
## Environment
- Environment: FrozenLake-v1
- State Space: 16 discrete states
- Action Space: 4 discrete actions
- Type: Stochastic grid-world (slippery surface)
FrozenLake is a small Markov Decision Process (MDP) where the agent must reach a goal while avoiding holes.
## Algorithm
This model uses Tabular Q-Learning, a model-free off-policy reinforcement learning algorithm.
Update rule:
Q(s,a) ← Q(s,a) + α [ r + γ max_a' Q(s',a') − Q(s,a) ]
Where:
- α = learning rate
- γ = discount factor
Because the environment is discrete and small, Q-values are stored in a Q-table of shape (16 × 4).
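The update rule above can be sketched as a single function over the Q-table. This is an illustrative sketch, not the repository's training code; the constants `ALPHA` and `GAMMA` take the values listed under Training Details below.

```python
import numpy as np

N_STATES, N_ACTIONS = 16, 4   # FrozenLake-v1 4x4 grid
ALPHA, GAMMA = 0.1, 0.99      # learning rate and discount factor

def q_update(Q, s, a, r, s_next, done):
    """One Q-Learning step: Q(s,a) += α [r + γ max_a' Q(s',a') − Q(s,a)].

    When the episode terminates there is no successor state, so the
    bootstrap term γ max_a' Q(s',a') is dropped.
    """
    target = r if done else r + GAMMA * np.max(Q[s_next])
    Q[s, a] += ALPHA * (target - Q[s, a])
    return Q

Q = np.zeros((N_STATES, N_ACTIONS))  # the (16 × 4) Q-table, initialized to zero
```

Because the table starts at zero, the first rewarding transition moves `Q[s, a]` by exactly `α · r`.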
## Training Details
- Learning rate (α): 0.1
- Discount factor (γ): 0.99
- Episodes: 5000
- Epsilon-greedy exploration with decay
The agent learns to maximize expected long-term reward despite stochastic transitions.
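The exploration loop can be sketched as follows. The README only states "epsilon-greedy with decay", so the schedule constants (`EPS_START`, `EPS_MIN`, `EPS_DECAY`) are illustrative assumptions, not the values used in training.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(Q, state, epsilon):
    """With probability ε take a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[state]))

# Multiplicative decay schedule (assumed values for illustration).
EPS_START, EPS_MIN, EPS_DECAY = 1.0, 0.01, 0.999
epsilon = EPS_START
for episode in range(5000):
    # ... run one episode, calling epsilon_greedy(Q, state, epsilon)
    #     and q-updating after each transition ...
    epsilon = max(EPS_MIN, epsilon * EPS_DECAY)
```

Early episodes are almost entirely random (exploration); as ε decays toward its floor, the agent increasingly exploits the learned Q-values.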
## Performance
Episode reward was tracked across training.
The trained agent learns a policy that reliably reaches the goal despite the slippery, stochastic transitions.
## Visualization
Below is the trained agent interacting with the environment:

![Trained agent demonstration](frozenlake_trained_agent.gif)
## Files
- `frozenlake_q_table.npy` → Trained Q-table
- `frozenlake_trained_agent.gif` → Agent demonstration
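The saved table can be used directly to act greedily, with no further learning. A minimal sketch, assuming `frozenlake_q_table.npy` sits in the working directory:

```python
import numpy as np

def greedy_policy(path="frozenlake_q_table.npy"):
    """Load the saved Q-table and return the greedy action index per state."""
    Q = np.load(path)            # shape (16, 4): one row per state
    return np.argmax(Q, axis=1)  # best action for each of the 16 states

# policy = greedy_policy()
# print(policy.reshape(4, 4))   # view actions laid out on the 4x4 grid
```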
## Summary
This project demonstrates:
- Tabular reinforcement learning
- Bellman optimality updates
- Exploration vs exploitation trade-off
- Convergence in finite MDPs
It serves as a foundational reinforcement learning example.
