Spaces:

AniketAsla
/

debatefloor

Running

App Files Files Community

debatefloor / docs /source /getting_started /plot_01_introduction_quickstart.py

AniketAsla

sync: mirror git d05fcb5 to Space

b4ac377 verified 12 days ago

raw

history blame contribute delete

30.4 kB

	"""
	Introduction & Quick Start
	==========================

	Part 1 of 5 in the OpenEnv Getting Started Series

	This notebook introduces OpenEnv, explains why it exists, and gets you
	running your first environment.

	.. note::
	Time: ~10 minutes \| Difficulty: Beginner \| GPU Required: No

	What You'll Learn
	-----------------

	- What is OpenEnv: The unified framework for RL environments
	- Why OpenEnv: How it compares to traditional solutions like Gym
	- RL Basics: The observe-act-reward loop in 60 seconds
	- Quick Start: Connect to and interact with your first environment
	"""

	# %%
	# Setup: Enable nested async event loops
	# --------------------------------------
	#
	# This is needed when running in environments like Sphinx-Gallery or Jupyter
	# that already have an event loop running.

	import nest_asyncio
	nest_asyncio.apply()

	# %%
	# What is OpenEnv?
	# ----------------
	#
	# OpenEnv is a **unified framework for building, sharing, and interacting with
	# reinforcement learning environments**. It's a collaborative effort between
	# Meta, Hugging Face, Unsloth, GPU Mode, and other industry leaders.
	#
	# The Goal: Make environment creation as easy and standardized as model
	# sharing on Hugging Face.
	#
	# Key Features
	# ~~~~~~~~~~~~
	#
	# - Standardized API: Gymnasium-style ``reset()``, ``step()``, ``state()``
	# - Type-Safe: Full IDE autocomplete and error checking
	# - Containerized: Environments run in Docker for isolation and reproducibility
	# - Shareable: Push to Hugging Face Hub with one command
	# - Language-Agnostic: HTTP/WebSocket API works from any language

	# %%
	# RL in 60 Seconds
	# ----------------
	#
	# Reinforcement Learning is simpler than you think. It's just a loop:
	#
	# .. code-block:: text
	#
	# ┌─────────────────────────────────────────────────────────────┐
	# │ THE RL LOOP │
	# │ │
	# │ ┌─────────┐ ┌─────────────┐ │
	# │ │ AGENT │─action─▶│ ENVIRONMENT │ │
	# │ │ │◀─reward─│ │ │
	# │ │ │◀──obs───│ │ │
	# │ └─────────┘ └─────────────┘ │
	# │ │
	# │ 1. Agent observes the environment │
	# │ 2. Agent chooses an action │
	# │ 3. Environment returns reward + new observation │
	# │ 4. Repeat until done │
	# └─────────────────────────────────────────────────────────────┘
	#
	# In code, it looks like this:
	#
	# .. code-block:: python
	#
	# result = env.reset() # Start episode
	# while not result.done:
	# action = agent.choose(result.observation)
	# result = env.step(action) # Take action, get reward
	# agent.learn(result.reward)
	#
	# That's it. That's RL!

	# %%
	# Why OpenEnv? (vs. Traditional Solutions)
	# ----------------------------------------
	#
	# Traditional RL environments (like OpenAI Gym/Gymnasium) have been the backbone
	# of RL research for years. They provide a simple API for interacting with
	# environments, and the community has built thousands of environments on top of them.
	#
	# However, as RL moves from research to production, several challenges emerge:
	#
	# The Problem with Traditional Approaches
	# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	#
	# 1. No Type Safety: Observations are numpy arrays like ``obs[0][3]``. What does
	# index 3 mean? You have to read documentation or source code to find out.
	#
	# 2. Same-Process Execution: The environment runs in your training process.
	# A bug in the environment can crash your entire training run.
	#
	# 3. Dependency Hell: Sharing environments means copying files and hoping
	# the recipient has the same dependencies installed.
	#
	# 4. Python Lock-in: Want to use Rust or C++ for your agent? Too bad—Gym is Python-only.
	#
	# 5. "Works on My Machine": Environments behave differently on different systems
	# due to floating-point differences, library versions, or OS quirks.
	#
	# How OpenEnv Solves These Problems
	# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	#
	# +------------------+----------------------------------+----------------------------------+
	# \| Challenge \| Traditional (Gym) \| OpenEnv \|
	# +==================+==================================+==================================+
	# \| Type Safety \| ``obs[0][3]`` - what is it? \| ``obs.info_state`` - IDE knows! \|
	# +------------------+----------------------------------+----------------------------------+
	# \| Isolation \| Same process (can crash) \| Docker container (isolated) \|
	# +------------------+----------------------------------+----------------------------------+
	# \| Deployment \| "Works on my machine" \| Same container everywhere \|
	# +------------------+----------------------------------+----------------------------------+
	# \| Sharing \| Copy files, manage deps \| ``openenv push`` to Hub \|
	# +------------------+----------------------------------+----------------------------------+
	# \| Language \| Python only \| Any language (HTTP/WebSocket) \|
	# +------------------+----------------------------------+----------------------------------+
	# \| Scaling \| Single machine \| Deploy to Kubernetes \|
	# +------------------+----------------------------------+----------------------------------+
	# \| Debugging \| Cryptic numpy index errors \| Clear, typed error messages \|
	# +------------------+----------------------------------+----------------------------------+
	#
	# Side-by-Side Code Comparison
	# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	#
	# Let's compare the same workflow in both approaches:
	#
	# Traditional Gym approach:
	#
	# .. code-block:: python
	#
	# import gym
	# import numpy as np
	#
	# # Create environment - runs in your process
	# env = gym.make("CartPole-v1")
	#
	# # Reset returns numpy arrays
	# obs, info = env.reset()
	# # obs = array([0.01, 0.02, -0.03, 0.01])
	# # What do these numbers mean? You have to check docs!
	#
	# # Step returns multiple values
	# obs, reward, done, truncated, info = env.step(action)
	# # No IDE autocomplete, easy to mix up return values
	#
	# # If env crashes, your whole training crashes
	# # Sharing requires: pip install gym[atari], hope versions match
	#
	# OpenEnv approach:
	#
	# .. code-block:: python
	#
	# from openenv import AutoEnv, AutoAction
	#
	# # Load environment and action classes via auto-discovery
	# OpenSpielEnv = AutoEnv.get_env_class("openspiel")
	# OpenSpielAction = AutoAction.from_env("openspiel")
	#
	# # Connect to containerized environment
	# with OpenSpielEnv(base_url="http://localhost:8000") as env:
	# # Reset returns typed StepResult
	# result = env.reset()
	# # result.observation.legal_actions - IDE autocompletes!
	# # result.observation.info_state - you know exactly what this is
	#
	# # Step with typed action
	# action = OpenSpielAction(action_id=1, game_name="catch")
	# result = env.step(action)
	# # result.reward, result.done - all typed
	#
	# # Environment runs in Docker - isolated from your code
	# # Share via: openenv push my-env (one command!)

	# %%
	# Part 1: Environment Setup
	# -------------------------
	#
	# Let's set up our environment. This works in Google Colab, locally, or
	# anywhere Python runs.

	import subprocess
	import sys
	from pathlib import Path

	# Detect environment
	try:
	import google.colab

	IN_COLAB = True
	except ImportError:
	IN_COLAB = False

	if IN_COLAB:
	print("=" * 70)
	print(" GOOGLE COLAB DETECTED - Installing OpenEnv...")
	print("=" * 70)

	# Install OpenEnv
	subprocess.run(
	[sys.executable, "-m", "pip", "install", "-q", "openenv-core"],
	capture_output=True,
	)
	print(" OpenEnv installed!")
	print("=" * 70)
	else:
	print("=" * 70)
	print(" RUNNING LOCALLY")
	print("=" * 70)
	print()
	print("If you haven't installed OpenEnv yet:")
	print(" pip install openenv-core")
	print()

	# Add src to path for local development (when running from docs folder)
	src_path = Path.cwd().parent.parent.parent / "src"
	if src_path.exists():
	sys.path.insert(0, str(src_path))

	# Add envs to path
	envs_path = Path.cwd().parent.parent.parent / "envs"
	if envs_path.exists():
	sys.path.insert(0, str(envs_path.parent))

	print("=" * 70)

	print()
	print("Ready to explore OpenEnv!")

	# %%
	# Part 2: Your First Environment - OpenSpiel
	# -------------------------------------------
	#
	# What is OpenSpiel?
	# ~~~~~~~~~~~~~~~~~~
	#
	# `OpenSpiel <https://github.com/google-deepmind/open_spiel>`_ is an open-source
	# collection of 70+ game environments developed by DeepMind for research in
	# reinforcement learning, game theory, and multi-agent systems.
	#
	# It includes:
	#
	# - Classic board games: Chess, Go, Backgammon, Tic-Tac-Toe
	# - Card games: Poker variants, Blackjack, Bridge
	# - Simple RL benchmarks: Catch, Cliff Walking, 2048
	# - Multi-agent games: Hanabi, Kuhn Poker, Negotiation games
	#
	# OpenSpiel is widely used in RL research because it provides consistent,
	# well-tested implementations with support for both single-player and multi-player
	# scenarios.
	#
	# How OpenSpiel Connects to OpenEnv
	# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	#
	# OpenEnv wraps OpenSpiel games as containerized, type-safe environments.
	# This means:
	#
	# 1. You get all the benefits of OpenSpiel's game library
	# 2. Plus type-safe Python clients with IDE autocomplete
	# 3. Plus Docker isolation for reproducibility
	# 4. Plus easy sharing via Hugging Face Hub
	#
	# Currently, OpenEnv includes wrappers for 6 OpenSpiel games:
	#
	# +------------------+-------------+------------------------------------------+
	# \| Game \| Players \| Description \|
	# +==================+=============+==========================================+
	# \| Catch \| 1 \| Catch a falling ball with a paddle \|
	# +------------------+-------------+------------------------------------------+
	# \| 2048 \| 1 \| Slide tiles to combine numbers \|
	# +------------------+-------------+------------------------------------------+
	# \| Blackjack \| 1 \| Classic card game against dealer \|
	# +------------------+-------------+------------------------------------------+
	# \| Cliff Walking\| 1 \| Navigate a grid while avoiding cliffs \|
	# +------------------+-------------+------------------------------------------+
	# \| Tic-Tac-Toe \| 2 \| Classic 3×3 grid game \|
	# +------------------+-------------+------------------------------------------+
	# \| Kuhn Poker \| 2 \| Simplified 3-card poker \|
	# +------------------+-------------+------------------------------------------+
	#
	# The Catch Game
	# ~~~~~~~~~~~~~~
	#
	# For this tutorial, we'll use Catch—one of the simplest RL environments.
	# It's perfect for learning because:
	#
	# - Simple rules (easy to understand)
	# - Fast episodes (10 steps each)
	# - Clear success metric (did you catch the ball?)
	# - Optimal strategy is learnable (move toward the ball)
	#
	# Game Rules:
	#
	# .. code-block:: text
	#
	# ⬜ ⬜ 🔴 ⬜ ⬜ <- Ball starts at random column (row 0)
	# ⬜ ⬜ ⬜ ⬜ ⬜
	# ⬜ ⬜ ⬜ ⬜ ⬜ The ball falls down one row
	# ⬜ ⬜ ⬜ ⬜ ⬜ each time step
	# ⬜ ⬜ ⬜ ⬜ ⬜
	# ⬜ ⬜ ⬜ ⬜ ⬜
	# ⬜ ⬜ ⬜ ⬜ ⬜
	# ⬜ ⬜ ⬜ ⬜ ⬜
	# ⬜ ⬜ ⬜ ⬜ ⬜
	# ⬜ ⬜ 🏓 ⬜ ⬜ <- Paddle at bottom (row 9)
	#
	# - Grid Size: 10 rows × 5 columns
	# - Ball: Starts at a random column in row 0, falls one row per step
	# - Paddle: Starts at center column, you control it
	# - Episode Length: 10 steps (ball reaches bottom)
	#
	# Actions:
	#
	# +------------+------------------+
	# \| Action ID \| Movement \|
	# +============+==================+
	# \| 0 \| Move LEFT \|
	# +------------+------------------+
	# \| 1 \| STAY (no move) \|
	# +------------+------------------+
	# \| 2 \| Move RIGHT \|
	# +------------+------------------+
	#
	# Rewards:
	#
	# - +1.0 if the paddle is in the same column as the ball when it lands
	# - 0.0 if you miss the ball
	#
	# Optimal Strategy: Track the ball's column and move toward it. A perfect
	# policy wins 100% of the time since the paddle can always reach any column
	# in 10 steps (grid is only 5 columns wide).
	#
	# Importing OpenEnv
	# ~~~~~~~~~~~~~~~~~
	#
	# First, let's import the OpenSpiel environment client and models:

	# Real imports from OpenEnv
	try:
	# Direct imports from the openspiel_env package
	from openspiel_env.client import OpenSpielEnv
	from openspiel_env.models import OpenSpielAction, OpenSpielObservation, OpenSpielState

	OPENENV_AVAILABLE = True
	print("✓ OpenEnv imports successful!")
	print(f" - OpenSpielEnv: {OpenSpielEnv}")
	print(f" - OpenSpielAction: {OpenSpielAction}")
	except ImportError as e:
	OPENENV_AVAILABLE = False
	print(f"✗ OpenEnv not fully installed: {e}")
	print(" Run: pip install openenv-core")
	print(" And: pip install -e ./envs/openspiel_env")

	# %%
	# Connecting to an Environment
	# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	#
	# OpenEnv provides three ways to connect to environments:
	#
	# 1. From Hugging Face Hub (auto-downloads and starts container)
	# 2. From Docker image (uses local image)
	# 3. From URL (connects to running server)
	#
	# Let's examine the actual methods available on the client class:

	print("=" * 70)
	print(" THREE WAYS TO CONNECT")
	print("=" * 70)
	print()

	if OPENENV_AVAILABLE:
	# Show actual method signatures from the class
	import inspect

	print("Connection methods available on OpenSpielEnv:")
	print()

	# Method 1: from_hub
	if hasattr(OpenSpielEnv, "from_hub"):
	sig = inspect.signature(OpenSpielEnv.from_hub)
	print(f"1. OpenSpielEnv.from_hub{sig}")
	print(" → Auto-downloads from Hugging Face, starts container, connects")
	print(" Example: env = OpenSpielEnv.from_hub('openenv/openspiel-env')")
	print()

	# Method 2: from_docker_image
	if hasattr(OpenSpielEnv, "from_docker_image"):
	sig = inspect.signature(OpenSpielEnv.from_docker_image)
	print(f"2. OpenSpielEnv.from_docker_image{sig}")
	print(" → Starts container from local image, connects")
	print(" Example: env = OpenSpielEnv.from_docker_image('openspiel-env:latest')")
	print()

	# Method 3: Direct connection
	sig = inspect.signature(OpenSpielEnv.__init__)
	print(f"3. OpenSpielEnv.__init__{sig}")
	print(" → Connects to already-running server")
	print(" Example: env = OpenSpielEnv(base_url='http://localhost:8000')")
	print()

	print("-" * 70)
	print("All three give you the same API - just different ways to start!")
	else:
	print("(OpenEnv not installed - showing expected methods)")
	print()
	print("1. OpenSpielEnv.from_hub(repo_id, *, use_docker=True, ...)")
	print(" → Auto-downloads from Hugging Face, starts container, connects")
	print()
	print("2. OpenSpielEnv.from_docker_image(image, provider=None, ...)")
	print(" → Starts container from local image, connects")
	print()
	print("3. OpenSpielEnv(base_url, connect_timeout_s=10.0, ...)")
	print(" → Connects to already-running server")

	# %%
	# Part 3: Playing the Catch Game
	# ------------------------------
	#
	# Now let's actually play! This code attempts to connect to a real server.
	# If no server is running, we'll show what the interaction looks like.

	import random

	# Check if we can connect to a server
	SERVER_URL = "http://localhost:8000"
	SERVER_AVAILABLE = False

	if OPENENV_AVAILABLE:
	try:
	# Try to connect using sync wrapper
	env = OpenSpielEnv(base_url=SERVER_URL)
	with env.sync() as client:
	# Quick test to verify connection
	pass
	SERVER_AVAILABLE = True
	print(f"✓ Connected to server at {SERVER_URL}")
	except Exception as e:
	print(f"✗ No server running at {SERVER_URL}")
	print(f" Error: {e}")
	print()
	print("To start a server, run one of these:")
	print(" docker run -p 8000:8000 openenv/openspiel-env:latest")
	print(" # OR")
	print(" cd envs/openspiel_env && openenv serve")

	# %%
	# Playing with a Real Server
	# ~~~~~~~~~~~~~~~~~~~~~~~~~~
	#
	# When connected to a real server, here's how the interaction works:

	if OPENENV_AVAILABLE and SERVER_AVAILABLE:
	print("=" * 70)
	print(" PLAYING CATCH - LIVE!")
	print("=" * 70)

	env = OpenSpielEnv(base_url=SERVER_URL)
	with env.sync() as client:
	# Reset to start a new episode
	result = client.reset()

	print(f"\nEpisode started!")
	print(f" Observation type: {type(result.observation).__name__}")
	print(f" Legal actions: {result.observation.legal_actions}")
	print(f" Done: {result.done}")

	# Play until the episode ends
	step_count = 0
	while not result.done:
	# Choose a random action from legal actions
	action_id = random.choice(result.observation.legal_actions)
	action = OpenSpielAction(action_id=action_id, game_name="catch")

	# Take the action
	result = client.step(action)
	step_count += 1

	print(f"\nStep {step_count}:")
	print(f" Action: {action_id} ({'LEFT' if action_id == 0 else 'STAY' if action_id == 1 else 'RIGHT'})")
	print(f" Reward: {result.reward}")
	print(f" Done: {result.done}")

	# Get final state
	state = client.state()
	print(f"\nEpisode complete!")
	print(f" Total steps: {state.step_count}")
	print(f" Final reward: {result.reward}")
	print(f" Result: {'CAUGHT!' if result.reward > 0 else 'MISSED!'}")

	else:
	# Run a local simulation to demonstrate the gameplay
	print("=" * 70)
	print(" PLAYING CATCH - LOCAL SIMULATION")
	print("=" * 70)
	print()
	print("No server running - demonstrating with local simulation.")
	print("(This shows exactly what happens when playing the real game)")
	print()

	# Simulate the Catch game locally
	GRID_HEIGHT = 10
	GRID_WIDTH = 5

	# Initialize game state
	ball_col = random.randint(0, GRID_WIDTH - 1)
	paddle_col = GRID_WIDTH // 2 # Start in center

	print(f"Game initialized:")
	print(f" Ball starting column: {ball_col}")
	print(f" Paddle starting column: {paddle_col}")
	print(f" Grid size: {GRID_HEIGHT} rows × {GRID_WIDTH} columns")
	print()

	# Simulate episode
	for step in range(GRID_HEIGHT):
	# Create observation (matching OpenSpiel format)
	info_state = [0.0] * (GRID_HEIGHT * GRID_WIDTH)
	info_state[step * GRID_WIDTH + ball_col] = 1.0 # Ball position
	info_state[(GRID_HEIGHT - 1) * GRID_WIDTH + paddle_col] = 1.0 # Paddle

	legal_actions = [0, 1, 2] # LEFT, STAY, RIGHT

	# Choose random action
	action_id = random.choice(legal_actions)
	action_name = {0: "LEFT", 1: "STAY", 2: "RIGHT"}[action_id]

	# Execute action
	old_paddle = paddle_col
	if action_id == 0: # LEFT
	paddle_col = max(0, paddle_col - 1)
	elif action_id == 2: # RIGHT
	paddle_col = min(GRID_WIDTH - 1, paddle_col + 1)

	print(f"Step {step + 1}: Ball at row {step}, col {ball_col} \| "
	f"Paddle: {old_paddle}→{paddle_col} ({action_name})")

	# Determine result
	caught = (paddle_col == ball_col)
	reward = 1.0 if caught else 0.0

	print()
	print(f"Episode complete!")
	print(f" Ball landed at column: {ball_col}")
	print(f" Paddle final column: {paddle_col}")
	print(f" Reward: {reward}")
	print(f" Result: {'CAUGHT! 🎉' if caught else 'MISSED! 😢'}")
	print()
	print("-" * 70)
	print("This is exactly how the real OpenSpielEnv works,")
	print("just running locally instead of via WebSocket to a server.")

	# %%
	# Part 4: Understanding the Response Types
	# ----------------------------------------
	#
	# OpenEnv uses type-safe models for all interactions. Let's create actual
	# instances and examine their attributes:

	print("=" * 70)
	print(" OPENENV TYPE SYSTEM - ACTUAL INSTANCES")
	print("=" * 70)

	# Create example instances that match what you'd get from the Catch game
	# These are the actual Pydantic models used by OpenEnv

	# 1. OpenSpielObservation - what the agent receives after each step
	print("\n📦 OpenSpielObservation (returned in StepResult)")
	print("-" * 50)

	if OPENENV_AVAILABLE:
	# OpenSpielObservation was already imported above via auto-discovery
	# Create a sample observation like what Catch game returns
	sample_observation = OpenSpielObservation(
	info_state=[0.0, 0.0, 1.0, 0.0, 0.0] + [0.0] * 45, # Ball at col 2, row 0
	legal_actions=[0, 1, 2], # LEFT, STAY, RIGHT
	game_phase="playing",
	current_player_id=0,
	opponent_last_action=None,
	)

	print(f" info_state: {sample_observation.info_state[:10]}... (length: {len(sample_observation.info_state)})")
	print(f" legal_actions: {sample_observation.legal_actions}")
	print(f" game_phase: {sample_observation.game_phase!r}")
	print(f" current_player_id: {sample_observation.current_player_id}")
	print(f" opponent_last_action: {sample_observation.opponent_last_action}")
	else:
	# Create without imports to show the structure
	from dataclasses import dataclass
	from typing import List, Optional

	@dataclass
	class OpenSpielObservation:
	info_state: List[float]
	legal_actions: List[int]
	game_phase: str = "playing"
	current_player_id: int = 0
	opponent_last_action: Optional[int] = None

	sample_observation = OpenSpielObservation(
	info_state=[0.0, 0.0, 1.0, 0.0, 0.0] + [0.0] * 45,
	legal_actions=[0, 1, 2],
	game_phase="playing",
	current_player_id=0,
	opponent_last_action=None,
	)

	print(f" info_state: {sample_observation.info_state[:10]}... (length: {len(sample_observation.info_state)})")
	print(f" legal_actions: {sample_observation.legal_actions}")
	print(f" game_phase: {sample_observation.game_phase!r}")
	print(f" current_player_id: {sample_observation.current_player_id}")
	print(f" opponent_last_action: {sample_observation.opponent_last_action}")

	# 2. OpenSpielState - the environment's internal state
	print("\n📊 OpenSpielState (returned by state())")
	print("-" * 50)

	if OPENENV_AVAILABLE:
	# OpenSpielState was already imported above via auto-discovery
	sample_state = OpenSpielState(
	game_name="catch",
	agent_player=0,
	opponent_policy="random",
	game_params={"rows": 10, "columns": 5},
	num_players=1,
	)

	print(f" game_name: {sample_state.game_name!r}")
	print(f" agent_player: {sample_state.agent_player}")
	print(f" opponent_policy: {sample_state.opponent_policy!r}")
	print(f" game_params: {sample_state.game_params}")
	print(f" num_players: {sample_state.num_players}")
	else:
	@dataclass
	class OpenSpielState:
	game_name: str = "catch"
	agent_player: int = 0
	opponent_policy: str = "random"
	game_params: dict = None
	num_players: int = 1

	sample_state = OpenSpielState(
	game_name="catch",
	agent_player=0,
	opponent_policy="random",
	game_params={"rows": 10, "columns": 5},
	num_players=1,
	)

	print(f" game_name: {sample_state.game_name!r}")
	print(f" agent_player: {sample_state.agent_player}")
	print(f" opponent_policy: {sample_state.opponent_policy!r}")
	print(f" game_params: {sample_state.game_params}")
	print(f" num_players: {sample_state.num_players}")

	# 3. OpenSpielAction - what you send to step()
	print("\n🎮 OpenSpielAction (what you send to step())")
	print("-" * 50)

	if OPENENV_AVAILABLE:
	# OpenSpielAction was already imported above via auto-discovery
	sample_action = OpenSpielAction(
	action_id=1, # STAY
	game_name="catch",
	game_params={"rows": 10, "columns": 5},
	)

	print(f" action_id: {sample_action.action_id} # 0=LEFT, 1=STAY, 2=RIGHT")
	print(f" game_name: {sample_action.game_name!r}")
	print(f" game_params: {sample_action.game_params}")
	else:
	@dataclass
	class OpenSpielAction:
	action_id: int
	game_name: str = "catch"
	game_params: dict = None

	sample_action = OpenSpielAction(
	action_id=1,
	game_name="catch",
	game_params={"rows": 10, "columns": 5},
	)

	print(f" action_id: {sample_action.action_id} # 0=LEFT, 1=STAY, 2=RIGHT")
	print(f" game_name: {sample_action.game_name!r}")
	print(f" game_params: {sample_action.game_params}")

	print("\n" + "=" * 70)
	print("These are the actual Pydantic/dataclass models used by OpenEnv.")
	print("Type safety helps catch errors before they reach the environment!")
	print("=" * 70)

	# %%
	# Part 5: The Architecture
	# ------------------------
	#
	# OpenEnv uses a client-server architecture:
	#
	# .. code-block:: text
	#
	# ┌─────────────────────────────────────────────────────────────┐
	# │ YOUR CODE │
	# │ │
	# │ from openenv import AutoEnv │
	# │ OpenSpielEnv = AutoEnv.get_env_class("openspiel") │
	# │ env = OpenSpielEnv(base_url="http://localhost:8000") │
	# │ result = env.reset() # Sends WebSocket message │
	# │ result = env.step(action) # Sends WebSocket message │
	# │ │
	# └────────────────────────┬────────────────────────────────────┘
	# │
	# │ WebSocket (persistent connection)
	# │
	# ┌────────────────────────▼────────────────────────────────────┐
	# │ DOCKER CONTAINER │
	# │ │
	# │ ┌─────────────────────────────────────────────────────┐ │
	# │ │ FastAPI Server + Environment Logic │ │
	# │ │ - /ws (WebSocket endpoint) │ │
	# │ │ - Handles reset(), step(), state() │ │
	# │ │ - Runs the actual game simulation │ │
	# │ └─────────────────────────────────────────────────────┘ │
	# │ │
	# │ Isolated • Reproducible • Scalable │
	# └─────────────────────────────────────────────────────────────┘
	#
	# Key insight: You never deal with HTTP/WebSocket directly.
	# The OpenEnv client handles all the networking!

	# %%
	# Summary
	# -------
	#
	# In this notebook, you learned:
	#
	# What OpenEnv Is:
	#
	# - A unified framework for RL environments
	# - Containerized, type-safe, and shareable
	#
	# Why Use OpenEnv:
	#
	# - Type safety with IDE autocomplete
	# - Isolated Docker containers
	# - Easy sharing via Hugging Face Hub
	#
	# How to Use It:
	#
	# - ``env.reset()`` - Start a new episode
	# - ``env.step(action)`` - Take an action
	# - ``env.state()`` - Get current state
	#
	# Next Steps
	# ----------
	#
	# Continue to Notebook 2: Using Environments
	#
	# In the next notebook, you'll:
	#
	# - Explore all available OpenEnv environments
	# - Create different AI policies
	# - Run evaluations and compare performance
	# - Work with multi-player games