Spaces:

YashashMathur
/

sql_data_analyst

Sleeping

App Files Files Community

sql_data_analyst / README.md

YashashMathur

Fix: Add proper YAML config to README

af940ac verified about 1 month ago

4.67 kB

title: SQL Data Analyst
emoji: 📊
colorFrom: gray
colorTo: green
sdk: docker
sdk_version: '0.1'
app_file: app.py
pinned: false
license: mit

An RL training environment where an AI agent learns to answer business intelligence questions by writing and executing SQL queries against a live database.

Motivation

Data analysts spend significant time translating business questions into SQL queries. This environment trains agents to do exactly that — iteratively exploring a database schema, writing queries, observing results, and submitting final answers.

Quick Start

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest tests/ -v

Observation Space

Field	Type	Description
`schema_summary`	string	Compact DB schema (one line per table)
`question`	string	Natural language business question
`last_query`	string \| null	Most recent SQL query
`last_result`	object \| null	Query result: columns, rows (max 50), error
`last_error`	string \| null	SQL error if last query failed
`step`	int	Current step number
`max_steps`	int	Episode step limit
`hints`	string[]	Progressive hints (revealed after step 5, 10, 15)
`done`	bool	Whether episode is complete

Action Space

Agent must submit exactly one of:

Action	Type	Description
`sql_query`	string	A SELECT or WITH SQL query to execute
`submit_answer`	string	Final answer — ends the episode

Tasks

Task	Difficulty	Max Steps	Description
`monthly_signups`	Easy	10	Count signups in the last 30 days
`top_revenue_category`	Medium	15	Find highest revenue product category in Q3
`churn_analysis`	Hard	20	Find emails of users who churned after 3 purchases

Reward Function

Rewards are given at every step (not just episode end):

+0.15 — Query executes without error
+0.10 — Query references a relevant table
+0.05 — Result has at least one row
+0.05 — Result is a sensible size
-0.02 per step beyond step 3 (efficiency penalty)
-0.10 if agent repeats the same query 3+ times
+0.00–0.60 on final submission (task grader × 0.60)

Usage

Python API

from env import SQLAnalystEnv, Action

env = SQLAnalystEnv(task_id="monthly_signups")
result = env.reset()
print(result.observation.question)

# Agent takes a step
result = env.step(Action(sql_query="SELECT COUNT(*) FROM users WHERE created_at >= DATE('now', '-30 days')"))
print(result.reward)

FastAPI Server

python -m uvicorn env.server:app --host 0.0.0.0 --port 7860

REST endpoints:

POST /reset — Reset environment
POST /step — Execute action
POST /state — Get current state
WebSocket /ws — WebSocket for low-latency training

Baseline Inference

export OPENAI_API_KEY=sk-...
python baseline/run_baseline.py

Docker

docker build -t sql-analyst-env .
docker run -p 7860:7860 sql-analyst-env

Tests

pytest tests/ -v

test_env.py — OpenEnv contract tests
test_graders.py — Task grader unit tests
test_reward.py — Reward calculator tests

All 46 tests pass.

Baseline Scores

Task	Score	Model
monthly_signups	~0.85	gpt-4o-mini
top_revenue_category	~0.65	gpt-4o-mini
churn_analysis	~0.40	gpt-4o-mini
Average	~0.63	gpt-4o-mini

File Structure

sql-data-analyst/
├── env/
│   ├── __init__.py
│   ├── models.py           # Pydantic models
│   ├── database.py         # SQLite + seeding
│   ├── environment.py      # Core environment
│   ├── reward.py           # Reward calculator
│   ├── utils.py            # Helpers
│   ├── server.py           # FastAPI server
│   └── tasks/
│       ├── __init__.py
│       ├── base.py
│       ├── easy.py
│       ├── medium.py
│       └── hard.py
├── baseline/
│   ├── __init__.py
│   ├── run_baseline.py
│   └── prompts.py
├── tests/
│   ├── __init__.py
│   ├── test_env.py
│   ├── test_graders.py
│   └── test_reward.py
├── openenv.yaml
├── Dockerfile
├── requirements.txt
└── README.md

License

MIT