# Q-Learning Agent playing FrozenLake-v1
This is a trained Q-Learning agent playing FrozenLake-v1 (4x4 map, non-slippery version). The agent was trained using a custom implementation of the Q-Learning algorithm.
## Environment
- Environment: FrozenLake-v1
- Map Name: 4x4
- Is Slippery: False (deterministic)
## Evaluation Results
| Metric | Value |
|---|---|
| Mean Reward | 1.00 +/- 0.00 |
| Evaluation Episodes | 100 |
## Hyperparameters
The agent was trained using the following hyperparameters:
- Total Training Episodes: 10,000
- Learning Rate: 0.7
- Gamma (Discount Factor): 0.95
- Max Steps per Episode: 99
- Epsilon (Exploration) Start: 1.0
- Epsilon (Exploration) Min: 0.05
- Decay Rate: 0.0005
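
These hyperparameters plug into the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α · [r + γ · maxₐ′ Q(s′, a′) − Q(s, a)]. The sketch below shows that update with the values listed above; the exponential epsilon-decay schedule is a common choice and an assumption here, not something this card specifies:

```python
import numpy as np

# Hyperparameters from the card
learning_rate = 0.7                          # alpha
gamma = 0.95                                 # discount factor
eps_start, eps_min, decay_rate = 1.0, 0.05, 0.0005

def q_update(qtable, state, action, reward, next_state, done):
    """One tabular Q-learning step: Q(s,a) += alpha * (TD target - Q(s,a))."""
    # No bootstrapping from next_state when the episode has ended
    target = reward + gamma * np.max(qtable[next_state]) * (not done)
    qtable[state, action] += learning_rate * (target - qtable[state, action])

def epsilon(episode):
    """Assumed exponential decay from eps_start toward eps_min."""
    return eps_min + (eps_start - eps_min) * np.exp(-decay_rate * episode)

# Example: a 16-state x 4-action Q-table (FrozenLake 4x4)
qtable = np.zeros((16, 4))
q_update(qtable, state=14, action=2, reward=1.0, next_state=15, done=True)
```

With an all-zero table and a terminal reward of 1.0, this first update moves Q(14, 2) to `learning_rate * 1.0`, i.e. 0.7.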
## Usage
To use this model, you need `gymnasium` and `pickle5` installed (on Python 3.8+, the built-in `pickle` module also works). You can load the model and evaluate it using the code below:
```python
import gymnasium as gym
import pickle5 as pickle
import numpy as np
from huggingface_hub import hf_hub_download

# 1. Download the model file from the Hub
repo_id = "Tejas-Anvekar/q-FrozenLake-v1-4x4-noSlippery"
filename = "q-learning.pkl"
pickle_model = hf_hub_download(repo_id=repo_id, filename=filename)

# 2. Load the model configuration and Q-table
with open(pickle_model, "rb") as f:
    model = pickle.load(f)

# 3. Create the environment
# IMPORTANT: Ensure is_slippery is set to False to match the training configuration
env = gym.make(model["env_id"], map_name="4x4", is_slippery=False, render_mode="rgb_array")

# 4. Define the greedy policy: always take the highest-value action
def greedy_policy(Qtable, state):
    action = np.argmax(Qtable[state][:])
    return action

# 5. Evaluate the agent over one episode
state, info = env.reset()
terminated = False
truncated = False
total_reward = 0

print("Agent is playing...")
while not terminated and not truncated:
    action = greedy_policy(model["qtable"], state)
    next_state, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    state = next_state

print(f"Game Finished! Total Reward: {total_reward}")
env.close()
```
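
The reported mean reward of 1.00 +/- 0.00 comes from repeated greedy rollouts. A sketch of such an evaluation loop (the function name and structure are illustrative, not the exact script used for the reported numbers):

```python
import numpy as np

def evaluate_agent(env, qtable, n_episodes=100, max_steps=99):
    """Run greedy rollouts and return mean and std of episode rewards."""
    rewards = []
    for _ in range(n_episodes):
        state, info = env.reset()
        total = 0.0
        for _ in range(max_steps):
            # Greedy action: highest Q-value for the current state
            action = int(np.argmax(qtable[state]))
            state, reward, terminated, truncated, info = env.step(action)
            total += reward
            if terminated or truncated:
                break
        rewards.append(total)
    return np.mean(rewards), np.std(rewards)
```

Because the map is deterministic (`is_slippery=False`), every greedy rollout follows the same path, which is why the standard deviation is 0.00.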