sql_data_analyst / README.md
YashashMathur's picture
Fix: Add proper YAML config to README
af940ac verified
|
raw
history blame
4.67 kB
metadata
title: SQL Data Analyst
emoji: πŸ“Š
colorFrom: gray
colorTo: green
sdk: docker
sdk_version: '0.1'
app_file: app.py
pinned: false
license: mit

An RL training environment where an AI agent learns to answer business intelligence questions by writing and executing SQL queries against a live database.

An RL training environment where an AI agent learns to answer business intelligence questions by writing and executing SQL queries against a live database.

Motivation

Data analysts spend significant time translating business questions into SQL queries. This environment trains agents to do exactly that β€” iteratively exploring a database schema, writing queries, observing results, and submitting final answers.

Quick Start

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest tests/ -v

Observation Space

Field Type Description
schema_summary string Compact DB schema (one line per table)
question string Natural language business question
last_query string | null Most recent SQL query
last_result object | null Query result: columns, rows (max 50), error
last_error string | null SQL error if last query failed
step int Current step number
max_steps int Episode step limit
hints string[] Progressive hints (revealed after step 5, 10, 15)
done bool Whether episode is complete

Action Space

Agent must submit exactly one of:

Action Type Description
sql_query string A SELECT or WITH SQL query to execute
submit_answer string Final answer β€” ends the episode

Tasks

Task Difficulty Max Steps Description
monthly_signups Easy 10 Count signups in the last 30 days
top_revenue_category Medium 15 Find highest revenue product category in Q3
churn_analysis Hard 20 Find emails of users who churned after 3 purchases

Reward Function

Rewards are given at every step (not just episode end):

  • +0.15 β€” Query executes without error
  • +0.10 β€” Query references a relevant table
  • +0.05 β€” Result has at least one row
  • +0.05 β€” Result is a sensible size
  • -0.02 per step beyond step 3 (efficiency penalty)
  • -0.10 if agent repeats the same query 3+ times
  • +0.00–0.60 on final submission (task grader Γ— 0.60)

Usage

Python API

from env import SQLAnalystEnv, Action

env = SQLAnalystEnv(task_id="monthly_signups")
result = env.reset()
print(result.observation.question)

# Agent takes a step
result = env.step(Action(sql_query="SELECT COUNT(*) FROM users WHERE created_at >= DATE('now', '-30 days')"))
print(result.reward)

FastAPI Server

python -m uvicorn env.server:app --host 0.0.0.0 --port 7860

REST endpoints:

  • POST /reset β€” Reset environment
  • POST /step β€” Execute action
  • POST /state β€” Get current state
  • WebSocket /ws β€” WebSocket for low-latency training

Baseline Inference

export OPENAI_API_KEY=sk-...
python baseline/run_baseline.py

Docker

docker build -t sql-analyst-env .
docker run -p 7860:7860 sql-analyst-env

Tests

pytest tests/ -v
  • test_env.py β€” OpenEnv contract tests
  • test_graders.py β€” Task grader unit tests
  • test_reward.py β€” Reward calculator tests

All 46 tests pass.

Baseline Scores

Task Score Model
monthly_signups ~0.85 gpt-4o-mini
top_revenue_category ~0.65 gpt-4o-mini
churn_analysis ~0.40 gpt-4o-mini
Average ~0.63 gpt-4o-mini

File Structure

sql-data-analyst/
β”œβ”€β”€ env/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ models.py           # Pydantic models
β”‚   β”œβ”€β”€ database.py         # SQLite + seeding
β”‚   β”œβ”€β”€ environment.py      # Core environment
β”‚   β”œβ”€β”€ reward.py           # Reward calculator
β”‚   β”œβ”€β”€ utils.py            # Helpers
β”‚   β”œβ”€β”€ server.py           # FastAPI server
β”‚   └── tasks/
β”‚       β”œβ”€β”€ __init__.py
β”‚       β”œβ”€β”€ base.py
β”‚       β”œβ”€β”€ easy.py
β”‚       β”œβ”€β”€ medium.py
β”‚       └── hard.py
β”œβ”€β”€ baseline/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ run_baseline.py
β”‚   └── prompts.py
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ test_env.py
β”‚   β”œβ”€β”€ test_graders.py
β”‚   └── test_reward.py
β”œβ”€β”€ openenv.yaml
β”œβ”€β”€ Dockerfile
β”œβ”€β”€ requirements.txt
└── README.md

License

MIT