debatefloor / docs /source /getting_started /plot_01_introduction_quickstart.py
AniketAsla's picture
sync: mirror git d05fcb5 to Space
b4ac377 verified
"""
Introduction & Quick Start
==========================
**Part 1 of 5** in the OpenEnv Getting Started Series
This notebook introduces OpenEnv, explains why it exists, and gets you
running your first environment.
.. note::
**Time**: ~10 minutes | **Difficulty**: Beginner | **GPU Required**: No
What You'll Learn
-----------------
- **What is OpenEnv**: The unified framework for RL environments
- **Why OpenEnv**: How it compares to traditional solutions like Gym
- **RL Basics**: The observe-act-reward loop in 60 seconds
- **Quick Start**: Connect to and interact with your first environment
"""
# %%
# Setup: Enable nested async event loops
# --------------------------------------
#
# This is needed when running in environments like Sphinx-Gallery or Jupyter
# that already have an event loop running.
import nest_asyncio
nest_asyncio.apply()
# %%
# What is OpenEnv?
# ----------------
#
# OpenEnv is a **unified framework for building, sharing, and interacting with
# reinforcement learning environments**. It's a collaborative effort between
# Meta, Hugging Face, Unsloth, GPU Mode, and other industry leaders.
#
# **The Goal**: Make environment creation as easy and standardized as model
# sharing on Hugging Face.
#
# Key Features
# ~~~~~~~~~~~~
#
# - **Standardized API**: Gymnasium-style ``reset()``, ``step()``, ``state()``
# - **Type-Safe**: Full IDE autocomplete and error checking
# - **Containerized**: Environments run in Docker for isolation and reproducibility
# - **Shareable**: Push to Hugging Face Hub with one command
# - **Language-Agnostic**: HTTP/WebSocket API works from any language
# %%
# RL in 60 Seconds
# ----------------
#
# Reinforcement Learning is simpler than you think. It's just a loop:
#
# .. code-block:: text
#
# ┌─────────────────────────────────────────────────────────────┐
# │ THE RL LOOP │
# │ │
# │ ┌─────────┐ ┌─────────────┐ │
# │ │ AGENT │─action─▶│ ENVIRONMENT │ │
# │ │ │◀─reward─│ │ │
# │ │ │◀──obs───│ │ │
# │ └─────────┘ └─────────────┘ │
# │ │
# │ 1. Agent observes the environment │
# │ 2. Agent chooses an action │
# │ 3. Environment returns reward + new observation │
# │ 4. Repeat until done │
# └─────────────────────────────────────────────────────────────┘
#
# In code, it looks like this:
#
# .. code-block:: python
#
# result = env.reset() # Start episode
# while not result.done:
# action = agent.choose(result.observation)
# result = env.step(action) # Take action, get reward
# agent.learn(result.reward)
#
# That's it. That's RL!
# %%
# Why OpenEnv? (vs. Traditional Solutions)
# ----------------------------------------
#
# Traditional RL environments (like OpenAI Gym/Gymnasium) have been the backbone
# of RL research for years. They provide a simple API for interacting with
# environments, and the community has built thousands of environments on top of them.
#
# However, as RL moves from research to production, several challenges emerge:
#
# The Problem with Traditional Approaches
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# 1. **No Type Safety**: Observations are numpy arrays like ``obs[0][3]``. What does
# index 3 mean? You have to read documentation or source code to find out.
#
# 2. **Same-Process Execution**: The environment runs in your training process.
# A bug in the environment can crash your entire training run.
#
# 3. **Dependency Hell**: Sharing environments means copying files and hoping
# the recipient has the same dependencies installed.
#
# 4. **Python Lock-in**: Want to use Rust or C++ for your agent? Too bad—Gym is Python-only.
#
# 5. **"Works on My Machine"**: Environments behave differently on different systems
# due to floating-point differences, library versions, or OS quirks.
#
# How OpenEnv Solves These Problems
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# +------------------+----------------------------------+----------------------------------+
# | Challenge | Traditional (Gym) | OpenEnv |
# +==================+==================================+==================================+
# | **Type Safety** | ``obs[0][3]`` - what is it? | ``obs.info_state`` - IDE knows! |
# +------------------+----------------------------------+----------------------------------+
# | **Isolation** | Same process (can crash) | Docker container (isolated) |
# +------------------+----------------------------------+----------------------------------+
# | **Deployment** | "Works on my machine" | Same container everywhere |
# +------------------+----------------------------------+----------------------------------+
# | **Sharing** | Copy files, manage deps | ``openenv push`` to Hub |
# +------------------+----------------------------------+----------------------------------+
# | **Language** | Python only | Any language (HTTP/WebSocket) |
# +------------------+----------------------------------+----------------------------------+
# | **Scaling** | Single machine | Deploy to Kubernetes |
# +------------------+----------------------------------+----------------------------------+
# | **Debugging** | Cryptic numpy index errors | Clear, typed error messages |
# +------------------+----------------------------------+----------------------------------+
#
# Side-by-Side Code Comparison
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# Let's compare the same workflow in both approaches:
#
# **Traditional Gym approach:**
#
# .. code-block:: python
#
# import gym
# import numpy as np
#
# # Create environment - runs in your process
# env = gym.make("CartPole-v1")
#
# # Reset returns numpy arrays
# obs, info = env.reset()
# # obs = array([0.01, 0.02, -0.03, 0.01])
# # What do these numbers mean? You have to check docs!
#
# # Step returns multiple values
# obs, reward, done, truncated, info = env.step(action)
# # No IDE autocomplete, easy to mix up return values
#
# # If env crashes, your whole training crashes
# # Sharing requires: pip install gym[atari], hope versions match
#
# **OpenEnv approach:**
#
# .. code-block:: python
#
# from openenv import AutoEnv, AutoAction
#
# # Load environment and action classes via auto-discovery
# OpenSpielEnv = AutoEnv.get_env_class("openspiel")
# OpenSpielAction = AutoAction.from_env("openspiel")
#
# # Connect to containerized environment
# with OpenSpielEnv(base_url="http://localhost:8000") as env:
# # Reset returns typed StepResult
# result = env.reset()
# # result.observation.legal_actions - IDE autocompletes!
# # result.observation.info_state - you know exactly what this is
#
# # Step with typed action
# action = OpenSpielAction(action_id=1, game_name="catch")
# result = env.step(action)
# # result.reward, result.done - all typed
#
# # Environment runs in Docker - isolated from your code
# # Share via: openenv push my-env (one command!)
# %%
# Part 1: Environment Setup
# -------------------------
#
# Let's set up our environment. This works in Google Colab, locally, or
# anywhere Python runs.
import subprocess
import sys
from pathlib import Path
# Detect environment
try:
import google.colab
IN_COLAB = True
except ImportError:
IN_COLAB = False
if IN_COLAB:
print("=" * 70)
print(" GOOGLE COLAB DETECTED - Installing OpenEnv...")
print("=" * 70)
# Install OpenEnv
subprocess.run(
[sys.executable, "-m", "pip", "install", "-q", "openenv-core"],
capture_output=True,
)
print(" OpenEnv installed!")
print("=" * 70)
else:
print("=" * 70)
print(" RUNNING LOCALLY")
print("=" * 70)
print()
print("If you haven't installed OpenEnv yet:")
print(" pip install openenv-core")
print()
# Add src to path for local development (when running from docs folder)
src_path = Path.cwd().parent.parent.parent / "src"
if src_path.exists():
sys.path.insert(0, str(src_path))
# Add envs to path
envs_path = Path.cwd().parent.parent.parent / "envs"
if envs_path.exists():
sys.path.insert(0, str(envs_path.parent))
print("=" * 70)
print()
print("Ready to explore OpenEnv!")
# %%
# Part 2: Your First Environment - OpenSpiel
# -------------------------------------------
#
# What is OpenSpiel?
# ~~~~~~~~~~~~~~~~~~
#
# `OpenSpiel <https://github.com/google-deepmind/open_spiel>`_ is an open-source
# collection of **70+ game environments** developed by DeepMind for research in
# reinforcement learning, game theory, and multi-agent systems.
#
# It includes:
#
# - **Classic board games**: Chess, Go, Backgammon, Tic-Tac-Toe
# - **Card games**: Poker variants, Blackjack, Bridge
# - **Simple RL benchmarks**: Catch, Cliff Walking, 2048
# - **Multi-agent games**: Hanabi, Kuhn Poker, Negotiation games
#
# OpenSpiel is widely used in RL research because it provides consistent,
# well-tested implementations with support for both single-player and multi-player
# scenarios.
#
# How OpenSpiel Connects to OpenEnv
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# OpenEnv wraps OpenSpiel games as **containerized, type-safe environments**.
# This means:
#
# 1. You get all the benefits of OpenSpiel's game library
# 2. Plus type-safe Python clients with IDE autocomplete
# 3. Plus Docker isolation for reproducibility
# 4. Plus easy sharing via Hugging Face Hub
#
# Currently, OpenEnv includes wrappers for 6 OpenSpiel games:
#
# +------------------+-------------+------------------------------------------+
# | Game | Players | Description |
# +==================+=============+==========================================+
# | **Catch** | 1 | Catch a falling ball with a paddle |
# +------------------+-------------+------------------------------------------+
# | **2048** | 1 | Slide tiles to combine numbers |
# +------------------+-------------+------------------------------------------+
# | **Blackjack** | 1 | Classic card game against dealer |
# +------------------+-------------+------------------------------------------+
# | **Cliff Walking**| 1 | Navigate a grid while avoiding cliffs |
# +------------------+-------------+------------------------------------------+
# | **Tic-Tac-Toe** | 2 | Classic 3×3 grid game |
# +------------------+-------------+------------------------------------------+
# | **Kuhn Poker** | 2 | Simplified 3-card poker |
# +------------------+-------------+------------------------------------------+
#
# The Catch Game
# ~~~~~~~~~~~~~~
#
# For this tutorial, we'll use **Catch**—one of the simplest RL environments.
# It's perfect for learning because:
#
# - Simple rules (easy to understand)
# - Fast episodes (10 steps each)
# - Clear success metric (did you catch the ball?)
# - Optimal strategy is learnable (move toward the ball)
#
# **Game Rules:**
#
# .. code-block:: text
#
# ⬜ ⬜ 🔴 ⬜ ⬜ <- Ball starts at random column (row 0)
# ⬜ ⬜ ⬜ ⬜ ⬜
# ⬜ ⬜ ⬜ ⬜ ⬜ The ball falls down one row
# ⬜ ⬜ ⬜ ⬜ ⬜ each time step
# ⬜ ⬜ ⬜ ⬜ ⬜
# ⬜ ⬜ ⬜ ⬜ ⬜
# ⬜ ⬜ ⬜ ⬜ ⬜
# ⬜ ⬜ ⬜ ⬜ ⬜
# ⬜ ⬜ ⬜ ⬜ ⬜
# ⬜ ⬜ 🏓 ⬜ ⬜ <- Paddle at bottom (row 9)
#
# - **Grid Size**: 10 rows × 5 columns
# - **Ball**: Starts at a random column in row 0, falls one row per step
# - **Paddle**: Starts at center column, you control it
# - **Episode Length**: 10 steps (ball reaches bottom)
#
# **Actions:**
#
# +------------+------------------+
# | Action ID | Movement |
# +============+==================+
# | 0 | Move LEFT |
# +------------+------------------+
# | 1 | STAY (no move) |
# +------------+------------------+
# | 2 | Move RIGHT |
# +------------+------------------+
#
# **Rewards:**
#
# - **+1.0** if the paddle is in the same column as the ball when it lands
# - **0.0** if you miss the ball
#
# **Optimal Strategy**: Track the ball's column and move toward it. A perfect
# policy wins 100% of the time since the paddle can always reach any column
# in 10 steps (grid is only 5 columns wide).
#
# Importing OpenEnv
# ~~~~~~~~~~~~~~~~~
#
# First, let's import the OpenSpiel environment client and models:
# Real imports from OpenEnv
try:
# Direct imports from the openspiel_env package
from openspiel_env.client import OpenSpielEnv
from openspiel_env.models import OpenSpielAction, OpenSpielObservation, OpenSpielState
OPENENV_AVAILABLE = True
print("✓ OpenEnv imports successful!")
print(f" - OpenSpielEnv: {OpenSpielEnv}")
print(f" - OpenSpielAction: {OpenSpielAction}")
except ImportError as e:
OPENENV_AVAILABLE = False
print(f"✗ OpenEnv not fully installed: {e}")
print(" Run: pip install openenv-core")
print(" And: pip install -e ./envs/openspiel_env")
# %%
# Connecting to an Environment
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# OpenEnv provides three ways to connect to environments:
#
# 1. **From Hugging Face Hub** (auto-downloads and starts container)
# 2. **From Docker image** (uses local image)
# 3. **From URL** (connects to running server)
#
# Let's examine the actual methods available on the client class:
print("=" * 70)
print(" THREE WAYS TO CONNECT")
print("=" * 70)
print()
if OPENENV_AVAILABLE:
# Show actual method signatures from the class
import inspect
print("Connection methods available on OpenSpielEnv:")
print()
# Method 1: from_hub
if hasattr(OpenSpielEnv, "from_hub"):
sig = inspect.signature(OpenSpielEnv.from_hub)
print(f"1. OpenSpielEnv.from_hub{sig}")
print(" → Auto-downloads from Hugging Face, starts container, connects")
print(" Example: env = OpenSpielEnv.from_hub('openenv/openspiel-env')")
print()
# Method 2: from_docker_image
if hasattr(OpenSpielEnv, "from_docker_image"):
sig = inspect.signature(OpenSpielEnv.from_docker_image)
print(f"2. OpenSpielEnv.from_docker_image{sig}")
print(" → Starts container from local image, connects")
print(" Example: env = OpenSpielEnv.from_docker_image('openspiel-env:latest')")
print()
# Method 3: Direct connection
sig = inspect.signature(OpenSpielEnv.__init__)
print(f"3. OpenSpielEnv.__init__{sig}")
print(" → Connects to already-running server")
print(" Example: env = OpenSpielEnv(base_url='http://localhost:8000')")
print()
print("-" * 70)
print("All three give you the same API - just different ways to start!")
else:
print("(OpenEnv not installed - showing expected methods)")
print()
print("1. OpenSpielEnv.from_hub(repo_id, *, use_docker=True, ...)")
print(" → Auto-downloads from Hugging Face, starts container, connects")
print()
print("2. OpenSpielEnv.from_docker_image(image, provider=None, ...)")
print(" → Starts container from local image, connects")
print()
print("3. OpenSpielEnv(base_url, connect_timeout_s=10.0, ...)")
print(" → Connects to already-running server")
# %%
# Part 3: Playing the Catch Game
# ------------------------------
#
# Now let's actually play! This code attempts to connect to a real server.
# If no server is running, we'll show what the interaction looks like.
import random
# Check if we can connect to a server
SERVER_URL = "http://localhost:8000"
SERVER_AVAILABLE = False
if OPENENV_AVAILABLE:
try:
# Try to connect using sync wrapper
env = OpenSpielEnv(base_url=SERVER_URL)
with env.sync() as client:
# Quick test to verify connection
pass
SERVER_AVAILABLE = True
print(f"✓ Connected to server at {SERVER_URL}")
except Exception as e:
print(f"✗ No server running at {SERVER_URL}")
print(f" Error: {e}")
print()
print("To start a server, run one of these:")
print(" docker run -p 8000:8000 openenv/openspiel-env:latest")
print(" # OR")
print(" cd envs/openspiel_env && openenv serve")
# %%
# Playing with a Real Server
# ~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# When connected to a real server, here's how the interaction works:
if OPENENV_AVAILABLE and SERVER_AVAILABLE:
print("=" * 70)
print(" PLAYING CATCH - LIVE!")
print("=" * 70)
env = OpenSpielEnv(base_url=SERVER_URL)
with env.sync() as client:
# Reset to start a new episode
result = client.reset()
print(f"\nEpisode started!")
print(f" Observation type: {type(result.observation).__name__}")
print(f" Legal actions: {result.observation.legal_actions}")
print(f" Done: {result.done}")
# Play until the episode ends
step_count = 0
while not result.done:
# Choose a random action from legal actions
action_id = random.choice(result.observation.legal_actions)
action = OpenSpielAction(action_id=action_id, game_name="catch")
# Take the action
result = client.step(action)
step_count += 1
print(f"\nStep {step_count}:")
print(f" Action: {action_id} ({'LEFT' if action_id == 0 else 'STAY' if action_id == 1 else 'RIGHT'})")
print(f" Reward: {result.reward}")
print(f" Done: {result.done}")
# Get final state
state = client.state()
print(f"\nEpisode complete!")
print(f" Total steps: {state.step_count}")
print(f" Final reward: {result.reward}")
print(f" Result: {'CAUGHT!' if result.reward > 0 else 'MISSED!'}")
else:
# Run a local simulation to demonstrate the gameplay
print("=" * 70)
print(" PLAYING CATCH - LOCAL SIMULATION")
print("=" * 70)
print()
print("No server running - demonstrating with local simulation.")
print("(This shows exactly what happens when playing the real game)")
print()
# Simulate the Catch game locally
GRID_HEIGHT = 10
GRID_WIDTH = 5
# Initialize game state
ball_col = random.randint(0, GRID_WIDTH - 1)
paddle_col = GRID_WIDTH // 2 # Start in center
print(f"Game initialized:")
print(f" Ball starting column: {ball_col}")
print(f" Paddle starting column: {paddle_col}")
print(f" Grid size: {GRID_HEIGHT} rows × {GRID_WIDTH} columns")
print()
# Simulate episode
for step in range(GRID_HEIGHT):
# Create observation (matching OpenSpiel format)
info_state = [0.0] * (GRID_HEIGHT * GRID_WIDTH)
info_state[step * GRID_WIDTH + ball_col] = 1.0 # Ball position
info_state[(GRID_HEIGHT - 1) * GRID_WIDTH + paddle_col] = 1.0 # Paddle
legal_actions = [0, 1, 2] # LEFT, STAY, RIGHT
# Choose random action
action_id = random.choice(legal_actions)
action_name = {0: "LEFT", 1: "STAY", 2: "RIGHT"}[action_id]
# Execute action
old_paddle = paddle_col
if action_id == 0: # LEFT
paddle_col = max(0, paddle_col - 1)
elif action_id == 2: # RIGHT
paddle_col = min(GRID_WIDTH - 1, paddle_col + 1)
print(f"Step {step + 1}: Ball at row {step}, col {ball_col} | "
f"Paddle: {old_paddle}{paddle_col} ({action_name})")
# Determine result
caught = (paddle_col == ball_col)
reward = 1.0 if caught else 0.0
print()
print(f"Episode complete!")
print(f" Ball landed at column: {ball_col}")
print(f" Paddle final column: {paddle_col}")
print(f" Reward: {reward}")
print(f" Result: {'CAUGHT! 🎉' if caught else 'MISSED! 😢'}")
print()
print("-" * 70)
print("This is exactly how the real OpenSpielEnv works,")
print("just running locally instead of via WebSocket to a server.")
# %%
# Part 4: Understanding the Response Types
# ----------------------------------------
#
# OpenEnv uses type-safe models for all interactions. Let's create actual
# instances and examine their attributes:
print("=" * 70)
print(" OPENENV TYPE SYSTEM - ACTUAL INSTANCES")
print("=" * 70)
# Create example instances that match what you'd get from the Catch game
# These are the actual Pydantic models used by OpenEnv
# 1. OpenSpielObservation - what the agent receives after each step
print("\n📦 OpenSpielObservation (returned in StepResult)")
print("-" * 50)
if OPENENV_AVAILABLE:
# OpenSpielObservation was already imported above via auto-discovery
# Create a sample observation like what Catch game returns
sample_observation = OpenSpielObservation(
info_state=[0.0, 0.0, 1.0, 0.0, 0.0] + [0.0] * 45, # Ball at col 2, row 0
legal_actions=[0, 1, 2], # LEFT, STAY, RIGHT
game_phase="playing",
current_player_id=0,
opponent_last_action=None,
)
print(f" info_state: {sample_observation.info_state[:10]}... (length: {len(sample_observation.info_state)})")
print(f" legal_actions: {sample_observation.legal_actions}")
print(f" game_phase: {sample_observation.game_phase!r}")
print(f" current_player_id: {sample_observation.current_player_id}")
print(f" opponent_last_action: {sample_observation.opponent_last_action}")
else:
# Create without imports to show the structure
from dataclasses import dataclass
from typing import List, Optional
@dataclass
class OpenSpielObservation:
info_state: List[float]
legal_actions: List[int]
game_phase: str = "playing"
current_player_id: int = 0
opponent_last_action: Optional[int] = None
sample_observation = OpenSpielObservation(
info_state=[0.0, 0.0, 1.0, 0.0, 0.0] + [0.0] * 45,
legal_actions=[0, 1, 2],
game_phase="playing",
current_player_id=0,
opponent_last_action=None,
)
print(f" info_state: {sample_observation.info_state[:10]}... (length: {len(sample_observation.info_state)})")
print(f" legal_actions: {sample_observation.legal_actions}")
print(f" game_phase: {sample_observation.game_phase!r}")
print(f" current_player_id: {sample_observation.current_player_id}")
print(f" opponent_last_action: {sample_observation.opponent_last_action}")
# 2. OpenSpielState - the environment's internal state
print("\n📊 OpenSpielState (returned by state())")
print("-" * 50)
if OPENENV_AVAILABLE:
# OpenSpielState was already imported above via auto-discovery
sample_state = OpenSpielState(
game_name="catch",
agent_player=0,
opponent_policy="random",
game_params={"rows": 10, "columns": 5},
num_players=1,
)
print(f" game_name: {sample_state.game_name!r}")
print(f" agent_player: {sample_state.agent_player}")
print(f" opponent_policy: {sample_state.opponent_policy!r}")
print(f" game_params: {sample_state.game_params}")
print(f" num_players: {sample_state.num_players}")
else:
@dataclass
class OpenSpielState:
game_name: str = "catch"
agent_player: int = 0
opponent_policy: str = "random"
game_params: dict = None
num_players: int = 1
sample_state = OpenSpielState(
game_name="catch",
agent_player=0,
opponent_policy="random",
game_params={"rows": 10, "columns": 5},
num_players=1,
)
print(f" game_name: {sample_state.game_name!r}")
print(f" agent_player: {sample_state.agent_player}")
print(f" opponent_policy: {sample_state.opponent_policy!r}")
print(f" game_params: {sample_state.game_params}")
print(f" num_players: {sample_state.num_players}")
# 3. OpenSpielAction - what you send to step()
print("\n🎮 OpenSpielAction (what you send to step())")
print("-" * 50)
if OPENENV_AVAILABLE:
# OpenSpielAction was already imported above via auto-discovery
sample_action = OpenSpielAction(
action_id=1, # STAY
game_name="catch",
game_params={"rows": 10, "columns": 5},
)
print(f" action_id: {sample_action.action_id} # 0=LEFT, 1=STAY, 2=RIGHT")
print(f" game_name: {sample_action.game_name!r}")
print(f" game_params: {sample_action.game_params}")
else:
@dataclass
class OpenSpielAction:
action_id: int
game_name: str = "catch"
game_params: dict = None
sample_action = OpenSpielAction(
action_id=1,
game_name="catch",
game_params={"rows": 10, "columns": 5},
)
print(f" action_id: {sample_action.action_id} # 0=LEFT, 1=STAY, 2=RIGHT")
print(f" game_name: {sample_action.game_name!r}")
print(f" game_params: {sample_action.game_params}")
print("\n" + "=" * 70)
print("These are the actual Pydantic/dataclass models used by OpenEnv.")
print("Type safety helps catch errors before they reach the environment!")
print("=" * 70)
# %%
# Part 5: The Architecture
# ------------------------
#
# OpenEnv uses a client-server architecture:
#
# .. code-block:: text
#
# ┌─────────────────────────────────────────────────────────────┐
# │ YOUR CODE │
# │ │
# │ from openenv import AutoEnv │
# │ OpenSpielEnv = AutoEnv.get_env_class("openspiel") │
# │ env = OpenSpielEnv(base_url="http://localhost:8000") │
# │ result = env.reset() # Sends WebSocket message │
# │ result = env.step(action) # Sends WebSocket message │
# │ │
# └────────────────────────┬────────────────────────────────────┘
# │
# │ WebSocket (persistent connection)
# │
# ┌────────────────────────▼────────────────────────────────────┐
# │ DOCKER CONTAINER │
# │ │
# │ ┌─────────────────────────────────────────────────────┐ │
# │ │ FastAPI Server + Environment Logic │ │
# │ │ - /ws (WebSocket endpoint) │ │
# │ │ - Handles reset(), step(), state() │ │
# │ │ - Runs the actual game simulation │ │
# │ └─────────────────────────────────────────────────────┘ │
# │ │
# │ Isolated • Reproducible • Scalable │
# └─────────────────────────────────────────────────────────────┘
#
# **Key insight**: You never deal with HTTP/WebSocket directly.
# The OpenEnv client handles all the networking!
# %%
# Summary
# -------
#
# In this notebook, you learned:
#
# **What OpenEnv Is:**
#
# - A unified framework for RL environments
# - Containerized, type-safe, and shareable
#
# **Why Use OpenEnv:**
#
# - Type safety with IDE autocomplete
# - Isolated Docker containers
# - Easy sharing via Hugging Face Hub
#
# **How to Use It:**
#
# - ``env.reset()`` - Start a new episode
# - ``env.step(action)`` - Take an action
# - ``env.state()`` - Get current state
#
# Next Steps
# ----------
#
# **Continue to Notebook 2: Using Environments**
#
# In the next notebook, you'll:
#
# - Explore all available OpenEnv environments
# - Create different AI policies
# - Run evaluations and compare performance
# - Work with multi-player games