Q-Learning Agent on Taxi-v3
This repository contains a trained Q-Learning agent that successfully solves the Taxi-v3 environment.
Model Card
Model Name: q-Taxi-v3
Environment: Taxi-v3
Algorithm: Q-Learning
Performance Metric:
- Mean Reward: 7.52 +/- 2.67 over evaluation runs (self-reported)
- Verification: Not yet independently verified
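For readers unfamiliar with the algorithm named above, the following is a minimal sketch of a single tabular Q-learning update. It is not the training code used for this model: the function name `q_learning_update` and the default values for `learning_rate` and `gamma` are assumptions chosen for illustration.

```python
import numpy as np


def q_learning_update(qtable, state, action, reward, next_state, done,
                      learning_rate=0.7, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the TD error.

    The hyperparameter defaults are illustrative assumptions, not the
    values used to train this repository's agent.
    """
    # Bootstrap from the best next action unless the episode has ended.
    best_next = 0.0 if done else float(np.max(qtable[next_state]))
    td_error = reward + gamma * best_next - qtable[state, action]
    qtable[state, action] += learning_rate * td_error
    return float(td_error)
```

Calling this once per environment step, with actions chosen by an exploration strategy such as epsilon-greedy, is the usual way such a table is trained.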
Usage
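import pickle

import gym
from huggingface_hub import hf_hub_download

# Download the serialized model dictionary from the Hub
model_path = hf_hub_download(
    repo_id="KraTUZen/q-Taxi-v3",
    filename="q-learning.pkl",
)
# Deserialize the trained Q-learning model
with open(model_path, "rb") as f:
    model = pickle.load(f)

# Initialize the environment named in the model dictionary
env = gym.make(model["env_id"])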
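Once the model dictionary and environment exist, a greedy evaluation loop might look like the sketch below. It assumes the Q-table is a NumPy array indexed as `qtable[state, action]`; the helper names `greedy_action` and `evaluate_agent`, and the defaults for `n_episodes` and `max_steps`, are hypothetical. The loop accepts both the older Gym step API (4 return values) and the newer one (5 return values).

```python
import numpy as np


def greedy_action(qtable, state):
    """Return the index of the highest-valued action for `state`."""
    return int(np.argmax(qtable[state]))


def evaluate_agent(env, qtable, n_episodes=100, max_steps=99):
    """Run greedy episodes and return (mean, std) of total episode reward."""
    episode_rewards = []
    for _ in range(n_episodes):
        reset_out = env.reset()
        # Newer Gym/Gymnasium return (obs, info); older Gym returns obs only.
        state = reset_out[0] if isinstance(reset_out, tuple) else reset_out
        total = 0.0
        for _ in range(max_steps):
            step_out = env.step(greedy_action(qtable, state))
            if len(step_out) == 5:
                state, reward, terminated, truncated, _ = step_out
                done = terminated or truncated
            else:
                state, reward, done, _ = step_out
            total += reward
            if done:
                break
        episode_rewards.append(total)
    return float(np.mean(episode_rewards)), float(np.std(episode_rewards))
```

Assuming the table is stored under a key such as `"qtable"` (a key name this README does not confirm), usage would be along the lines of `mean, std = evaluate_agent(env, model["qtable"])`.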
Notes
- The agent is trained on the Taxi-v3 environment, where the goal is to pick up and drop off passengers efficiently.
- The Q-table is serialized in q-learning.pkl.
- You can load the agent and set up its environment for evaluation using the snippet in the Usage section.
Repository Structure
- q-learning.pkl: Serialized Q-table of the trained agent
- README.md: Documentation and usage guide
Results
In evaluation, the agent navigates the grid, picks up the passenger, and completes the drop-off in few steps, reaching a self-reported mean reward of 7.52 +/- 2.67.
Environment Overview
- Grid Size: 5x5
- Objective: Pick up passengers and drop them at designated locations
- Challenges: Efficient navigation, avoiding unnecessary moves, maximizing reward
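To make the overview concrete, here is a small sketch of how Taxi-v3 packs its 5x5 grid position, the passenger location (4 pickup spots or "in taxi"), and the destination (4 spots) into one of 500 discrete state indices, which are the row indices of the Q-table. The helper names `encode_state` and `decode_state` are illustrative and not part of this repository.

```python
def encode_state(taxi_row, taxi_col, passenger_loc, destination):
    """Pack the four Taxi-v3 state components into a single index (0..499).

    taxi_row, taxi_col: 0..4; passenger_loc: 0..4 (4 means "in the taxi");
    destination: 0..3.
    """
    return ((taxi_row * 5 + taxi_col) * 5 + passenger_loc) * 4 + destination


def decode_state(state):
    """Invert encode_state, returning (taxi_row, taxi_col, passenger_loc, destination)."""
    state, destination = divmod(state, 4)
    state, passenger_loc = divmod(state, 5)
    taxi_row, taxi_col = divmod(state, 5)
    return taxi_row, taxi_col, passenger_loc, destination
```

Decoding a state index this way can help when inspecting which Q-table rows correspond to a particular taxi position or passenger status.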
Evaluation results
- mean_reward on Taxi-v3 (self-reported): 7.52 +/- 2.67