	---
	title: SQL Data Analyst
	emoji: 📊
	colorFrom: gray
	colorTo: green
	sdk: docker
	sdk_version: "0.1"
	app_file: app.py
	pinned: false
	license: mit
	---

	An RL training environment where an AI agent learns to answer business intelligence questions by writing and executing SQL queries against a live database.

	## Motivation

	Data analysts spend significant time translating business questions into SQL queries. This environment trains agents to do exactly that — iteratively exploring a database schema, writing queries, observing results, and submitting final answers.

	## Quick Start

	```bash
	# Install dependencies
	pip install -r requirements.txt

	# Run tests
	pytest tests/ -v
	```

	## Observation Space

	| Field | Type | Description |
	|-------|------|-------------|
	| `schema_summary` | string | Compact DB schema (one line per table) |
	| `question` | string | Natural language business question |
	| `last_query` | string \| null | Most recent SQL query |
	| `last_result` | object \| null | Query result: columns, rows (max 50), error |
	| `last_error` | string \| null | SQL error if last query failed |
	| `step` | int | Current step number |
	| `max_steps` | int | Episode step limit |
	| `hints` | string[] | Progressive hints (revealed after step 5, 10, 15) |
	| `done` | bool | Whether episode is complete |
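
The fields in the table above can be sketched as plain Python types. This is a minimal illustration using standard-library dataclasses, not the Pydantic models in `env/models.py`; the `QueryResult` class name, the `visible_hints` helper, and the strict "step greater than threshold" rule for unlocking hints are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Thresholds from the table above; comparing with ">" is an assumption.
HINT_STEPS = (5, 10, 15)


@dataclass
class QueryResult:
    """Hypothetical shape for `last_result`: columns, rows (max 50), error."""
    columns: List[str] = field(default_factory=list)
    rows: List[List[Any]] = field(default_factory=list)
    error: Optional[str] = None


@dataclass
class Observation:
    schema_summary: str
    question: str
    step: int
    max_steps: int
    last_query: Optional[str] = None
    last_result: Optional[QueryResult] = None
    last_error: Optional[str] = None
    hints: List[str] = field(default_factory=list)
    done: bool = False


def visible_hints(all_hints: List[str], step: int) -> List[str]:
    """Return the hints unlocked so far, one more per threshold passed."""
    unlocked = sum(1 for threshold in HINT_STEPS if step > threshold)
    return all_hints[:unlocked]
```

With three hints configured, this sketch reveals none up to step 5, one from step 6, two from step 11, and all three from step 16.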

	## Action Space

	The agent must submit exactly one of:

	| Action | Type | Description |
	|--------|------|-------------|
	| `sql_query` | string | A SELECT or WITH SQL query to execute |
	| `submit_answer` | string | Final answer; ends the episode |
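
The "exactly one of" rule and the SELECT/WITH restriction can be checked at construction time. The sketch below is a stand-in for the real `Action` model, which may validate differently; the `is_read_only` helper and its regular expression are assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Coarse check: the statement must begin with SELECT or WITH.
_READ_ONLY_PREFIX = re.compile(r"\s*(SELECT|WITH)\b", re.IGNORECASE)


def is_read_only(sql: str) -> bool:
    """Return True if the first keyword is SELECT or WITH."""
    return _READ_ONLY_PREFIX.match(sql) is not None


@dataclass
class Action:
    sql_query: Optional[str] = None
    submit_answer: Optional[str] = None

    def __post_init__(self) -> None:
        provided = [v for v in (self.sql_query, self.submit_answer) if v is not None]
        if len(provided) != 1:
            raise ValueError("Provide exactly one of sql_query or submit_answer")
        if self.sql_query is not None and not is_read_only(self.sql_query):
            raise ValueError("sql_query must start with SELECT or WITH")
```

A prefix check like this is only a first filter. In SQLite a `WITH` clause can precede a write statement, so a read-only connection is a safer guard than string inspection alone.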

	## Tasks

	| Task | Difficulty | Max Steps | Description |
	|------|------------|-----------|-------------|
	| `monthly_signups` | Easy | 10 | Count signups in the last 30 days |
	| `top_revenue_category` | Medium | 15 | Find highest revenue product category in Q3 |
	| `churn_analysis` | Hard | 20 | Find emails of users who churned after 3 purchases |

	## Reward Function

	Rewards are given at every step (not just episode end):

	- `+0.15` — Query executes without error
	- `+0.10` — Query references a relevant table
	- `+0.05` — Result has at least one row
	- `+0.05` — Result is a sensible size
	- `-0.02` per step beyond step 3 (efficiency penalty)
	- `-0.10` if the agent repeats the same query 3+ times
	- `+0.00–0.60` on final submission (task grader × 0.60)
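
The components listed above can be combined as in the following sketch. It is not the code in `env/reward.py`, and several details the list leaves open are filled in as labeled assumptions: "sensible size" is taken to mean 1 to 50 rows, the efficiency penalty is a flat `-0.02` on each step after step 3, the table check is a case-insensitive substring match, and the repeat penalty applies once the same query text has been seen three times including the current step.

```python
from typing import Iterable, List, Optional


def step_reward(
    step: int,
    query: str,
    error: Optional[str],
    row_count: int,
    relevant_tables: Iterable[str],
    previous_queries: List[str],
    max_sensible_rows: int = 50,  # assumption: upper bound for a "sensible" result
) -> float:
    """Per-step shaping reward for a sql_query action (illustrative only)."""
    reward = 0.0
    if error is None:
        reward += 0.15  # query executed without error
        if row_count >= 1:
            reward += 0.05  # result has at least one row
        if 1 <= row_count <= max_sensible_rows:
            reward += 0.05  # result is a sensible size (assumed range)
    if any(table.lower() in query.lower() for table in relevant_tables):
        reward += 0.10  # query references a relevant table
    if step > 3:
        reward -= 0.02  # efficiency penalty (assumed flat per step)
    if previous_queries.count(query) + 1 >= 3:
        reward -= 0.10  # same query issued three or more times
    return round(reward, 2)


def final_submission_reward(grader_score: float) -> float:
    """Scale a task grader score, clamped to [0, 1], by 0.60."""
    return 0.60 * min(max(grader_score, 0.0), 1.0)
```

Under these assumptions, a first successful query that touches a relevant table and returns one row earns `0.35`, and the same query at step 5 earns `0.33`.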

	## Usage

	### Python API

	```python
	from env import SQLAnalystEnv, Action

	env = SQLAnalystEnv(task_id="monthly_signups")
	result = env.reset()
	print(result.observation.question)

	# Agent takes a step
	result = env.step(Action(sql_query="SELECT COUNT(*) FROM users WHERE created_at >= DATE('now', '-30 days')"))
	print(result.reward)
	```
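
The query in the example above uses SQLite's `DATE('now', '-30 days')` modifier. To see how that filter behaves without the environment, it can be run against an in-memory database. The `users` table layout and the sample rows here are hypothetical and are not the schema seeded by `env/database.py`.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT)"
)

# Two signups inside the 30-day window and one outside it.
conn.executemany(
    "INSERT INTO users (email, created_at) VALUES (?, DATE('now', ?))",
    [
        ("a@example.com", "-5 days"),
        ("b@example.com", "-29 days"),
        ("c@example.com", "-45 days"),
    ],
)

recent_signups = conn.execute(
    "SELECT COUNT(*) FROM users WHERE created_at >= DATE('now', '-30 days')"
).fetchone()[0]
print(recent_signups)  # 2
conn.close()
```

Because the dates are stored as ISO `YYYY-MM-DD` text, the `>=` comparison is a string comparison that matches chronological order.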

	### FastAPI Server

	```bash
	python -m uvicorn env.server:app --host 0.0.0.0 --port 7860
	```

	REST endpoints:
	- `POST /reset` — Reset environment
	- `POST /step` — Execute action
	- `POST /state` — Get current state
	- `WebSocket /ws` — WebSocket for low-latency training

	### Baseline Inference

	```bash
	export OPENAI_API_KEY=sk-...
	python baseline/run_baseline.py
	```

	### Docker

	```bash
	docker build -t sql-analyst-env .
	docker run -p 7860:7860 sql-analyst-env
	```

	## Tests

	```bash
	pytest tests/ -v
	```

	- `test_env.py` — OpenEnv contract tests
	- `test_graders.py` — Task grader unit tests
	- `test_reward.py` — Reward calculator tests

	All 46 tests pass.

	## Baseline Scores

	| Task | Score | Model |
	|------|-------|-------|
	| monthly_signups | ~0.85 | gpt-4o-mini |
	| top_revenue_category | ~0.65 | gpt-4o-mini |
	| churn_analysis | ~0.40 | gpt-4o-mini |
	| Average | ~0.63 | gpt-4o-mini |
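
The average row is consistent with an unweighted mean of the three task scores, which can be checked in a few lines. Treating it as an unweighted mean is an assumption; the table does not say how the average was computed.

```python
# Approximate per-task scores copied from the table above.
scores = {
    "monthly_signups": 0.85,
    "top_revenue_category": 0.65,
    "churn_analysis": 0.40,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 0.63
```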

	## File Structure

	```
	sql-data-analyst/
	├── env/
	│   ├── __init__.py
	│   ├── models.py        # Pydantic models
	│   ├── database.py      # SQLite + seeding
	│   ├── environment.py   # Core environment
	│   ├── reward.py        # Reward calculator
	│   ├── utils.py         # Helpers
	│   ├── server.py        # FastAPI server
	│   └── tasks/
	│       ├── __init__.py
	│       ├── base.py
	│       ├── easy.py
	│       ├── medium.py
	│       └── hard.py
	├── baseline/
	│   ├── __init__.py
	│   ├── run_baseline.py
	│   └── prompts.py
	├── tests/
	│   ├── __init__.py
	│   ├── test_env.py
	│   ├── test_graders.py
	│   └── test_reward.py
	├── openenv.yaml
	├── Dockerfile
	├── requirements.txt
	└── README.md
	```

	## License

	MIT