--- title: SQL Data Analyst emoji: 📊 colorFrom: gray colorTo: green sdk: docker sdk_version: "0.1" app_file: app.py pinned: false license: mit --- An RL training environment where an AI agent learns to answer business intelligence questions by writing and executing SQL queries against a live database. An RL training environment where an AI agent learns to answer business intelligence questions by writing and executing SQL queries against a live database. ## Motivation Data analysts spend significant time translating business questions into SQL queries. This environment trains agents to do exactly that — iteratively exploring a database schema, writing queries, observing results, and submitting final answers. ## Quick Start ```bash # Install dependencies pip install -r requirements.txt # Run tests pytest tests/ -v ``` ## Observation Space | Field | Type | Description | |-------|------|-------------| | `schema_summary` | string | Compact DB schema (one line per table) | | `question` | string | Natural language business question | | `last_query` | string \| null | Most recent SQL query | | `last_result` | object \| null | Query result: columns, rows (max 50), error | | `last_error` | string \| null | SQL error if last query failed | | `step` | int | Current step number | | `max_steps` | int | Episode step limit | | `hints` | string[] | Progressive hints (revealed after step 5, 10, 15) | | `done` | bool | Whether episode is complete | ## Action Space Agent must submit exactly one of: | Action | Type | Description | |--------|------|-------------| | `sql_query` | string | A SELECT or WITH SQL query to execute | | `submit_answer` | string | Final answer — ends the episode | ## Tasks | Task | Difficulty | Max Steps | Description | |------|------------|-----------|--------------| | `monthly_signups` | Easy | 10 | Count signups in the last 30 days | | `top_revenue_category` | Medium | 15 | Find highest revenue product category in Q3 | | `churn_analysis` | Hard | 20 | Find emails of users who churned after 3 purchases | ## Reward Function Rewards are given at every step (not just episode end): - `+0.15` — Query executes without error - `+0.10` — Query references a relevant table - `+0.05` — Result has at least one row - `+0.05` — Result is a sensible size - `-0.02` per step beyond step 3 (efficiency penalty) - `-0.10` if agent repeats the same query 3+ times - `+0.00–0.60` on final submission (task grader × 0.60) ## Usage ### Python API ```python from env import SQLAnalystEnv, Action env = SQLAnalystEnv(task_id="monthly_signups") result = env.reset() print(result.observation.question) # Agent takes a step result = env.step(Action(sql_query="SELECT COUNT(*) FROM users WHERE created_at >= DATE('now', '-30 days')")) print(result.reward) ``` ### FastAPI Server ```bash python -m uvicorn env.server:app --host 0.0.0.0 --port 7860 ``` REST endpoints: - `POST /reset` — Reset environment - `POST /step` — Execute action - `POST /state` — Get current state - `WebSocket /ws` — WebSocket for low-latency training ### Baseline Inference ```bash export OPENAI_API_KEY=sk-... python baseline/run_baseline.py ``` ### Docker ```bash docker build -t sql-analyst-env . docker run -p 7860:7860 sql-analyst-env ``` ## Tests ```bash pytest tests/ -v ``` - `test_env.py` — OpenEnv contract tests - `test_graders.py` — Task grader unit tests - `test_reward.py` — Reward calculator tests **All 46 tests pass.** ## Baseline Scores | Task | Score | Model | |------|-------|-------| | monthly_signups | ~0.85 | gpt-4o-mini | | top_revenue_category | ~0.65 | gpt-4o-mini | | churn_analysis | ~0.40 | gpt-4o-mini | | **Average** | **~0.63** | gpt-4o-mini | ## File Structure ``` sql-data-analyst/ ├── env/ │ ├── __init__.py │ ├── models.py # Pydantic models │ ├── database.py # SQLite + seeding │ ├── environment.py # Core environment │ ├── reward.py # Reward calculator │ ├── utils.py # Helpers │ ├── server.py # FastAPI server │ └── tasks/ │ ├── __init__.py │ ├── base.py │ ├── easy.py │ ├── medium.py │ └── hard.py ├── baseline/ │ ├── __init__.py │ ├── run_baseline.py │ └── prompts.py ├── tests/ │ ├── __init__.py │ ├── test_env.py │ ├── test_graders.py │ └── test_reward.py ├── openenv.yaml ├── Dockerfile ├── requirements.txt └── README.md ``` ## License MIT