rahul2124 committed
Commit 99aa2be · verified · 1 Parent(s): 72805b8

Upload folder using huggingface_hub

Files changed (1): README.md +91 -9
README.md CHANGED
@@ -1,3 +1,12 @@
  # SQL Arena - OpenEnv Environment

  An interactive SQL query challenge environment where AI agents learn to write SQL
@@ -9,21 +18,94 @@ Text-to-SQL is one of the most valuable capabilities for AI agents:
  - Used by data analysts, business users, and developers daily
  - Evaluates reasoning, schema understanding, and query composition
  - Directly applicable to production AI assistants and copilots
- - SQL Arena provides interactive iterative feedback (not just static benchmarks)

  ## Tasks

  | Task | Difficulty | Description | Max Steps |
  |------|-----------|-------------|-----------|
- | basic_select | Easy | SELECT, WHERE, ORDER BY on single table | 5 |
- | join_aggregate | Medium | Multi-table JOINs, GROUP BY, HAVING | 7 |
- | complex_analysis | Hard | CTEs, window functions, subqueries | 10 |

- Each difficulty has 3+ unique problems with deterministic, reproducible grading.

  ## Action Space

- ```json
- {
- "sql_query": "SELECT name, salary FROM employees WHERE salary > 80000 ORDER BY salary DESC"
- }
+ ---
+ title: SQL Arena
+ emoji: 🏟️
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ ---
+
  # SQL Arena - OpenEnv Environment

  An interactive SQL query challenge environment where AI agents learn to write SQL

  - Used by data analysts, business users, and developers daily
  - Evaluates reasoning, schema understanding, and query composition
  - Directly applicable to production AI assistants and copilots

  ## Tasks

  | Task | Difficulty | Description | Max Steps |
  |------|-----------|-------------|-----------|
+ | basic_select | Easy | SELECT, WHERE, ORDER BY | 5 |
+ | join_aggregate | Medium | JOINs, GROUP BY, HAVING | 7 |
+ | complex_analysis | Hard | CTEs, window functions | 10 |

+ Each difficulty has 3 unique problems with deterministic grading.

  ## Action Space

+ The agent sends a SQL query each step:
+
+ {"sql_query": "SELECT name, salary FROM employees WHERE salary > 80000"}
+
+ ## Observation Space
+
+ The agent receives back:
+
+ - schema_description: Database schema text
+ - question: Natural language question to answer
+ - query_result: Result table from last query
+ - error_message: Error if query failed
+ - feedback: Scoring feedback with hints
+ - expected_columns: Expected column names
+ - attempts_remaining: Steps left
+ - difficulty: Task difficulty level
+ - task_id: Problem identifier
+
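The action and observation fields added in this README can be sketched as typed containers. This is a minimal illustration only: the README says `models.py` uses Pydantic, while the sketch below swaps in standard-library dataclasses to stay dependency-free. The class names `SQLAction` and `SQLObservation`, the field types, and which fields are optional are all assumptions, not the project's actual definitions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SQLAction:
    """One agent step: a single SQL query string (hypothetical class name)."""
    sql_query: str


@dataclass
class SQLObservation:
    """Observation fields named in the README; the types here are assumed."""
    schema_description: str
    question: str
    task_id: str
    difficulty: str
    attempts_remaining: int
    expected_columns: List[str] = field(default_factory=list)
    query_result: Optional[str] = None   # assumed empty before the first query
    error_message: Optional[str] = None  # assumed None when the query succeeds
    feedback: Optional[str] = None


# Serialize an action into the JSON shape shown under "Action Space".
action = SQLAction(
    sql_query="SELECT name, salary FROM employees WHERE salary > 80000"
)
payload = json.dumps(asdict(action))
print(payload)
```

Keeping the action to a single `sql_query` field mirrors the one-key JSON object in the README; anything beyond that would need to be checked against `models.py`.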
+ ## Reward Function (0.0 to 1.0)
+
+ | Component | Weight | Description |
+ |-----------|--------|-------------|
+ | Execution | 0.10 | Query runs without error |
+ | Columns | 0.20 | Correct column names |
+ | Row Count | 0.20 | Correct number of rows |
+ | Values | 0.50 | Correct data values |
+
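The weights in the reward table sum to 1.0, so the total reward can be read as a weighted sum of per-component scores. The sketch below shows that arithmetic only. It assumes each component score is a number in [0, 1] (the README mentions partial credit but does not say how each component is scored), and `combine_reward` is a hypothetical helper name, not a function from `graders.py`.

```python
# Weights copied from the reward table in the README.
WEIGHTS = {
    "execution": 0.10,
    "columns": 0.20,
    "row_count": 0.20,
    "values": 0.50,
}


def combine_reward(scores):
    """Weighted sum of component scores, each clamped to [0, 1].

    Missing components count as 0.0. The result is rounded to four
    decimals to hide floating-point noise in the sum.
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        score = min(1.0, max(0.0, float(scores.get(name, 0.0))))
        total += weight * score
    return round(total, 4)


# A query that runs and returns the right columns, but the wrong rows.
partial = combine_reward({"execution": 1.0, "columns": 1.0})
# A fully correct query.
perfect = combine_reward(
    {"execution": 1.0, "columns": 1.0, "row_count": 1.0, "values": 1.0}
)
print(partial, perfect)
```

Under these assumptions, a query that executes and names the right columns but returns no correct data earns 0.10 + 0.20 = 0.30, which shows how the weighting keeps most of the reward tied to correct values.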
+ ## Setup
+
+ pip install -r requirements.txt
+
+ ## Run Server
+
+ uvicorn src.sql_arena.server:app --host 0.0.0.0 --port 7860
+
+ ## Run Inference
+
+ set HF_TOKEN=your_token
+ python inference.py
+
+ ## Docker
+
+ docker build -t sql-arena .
+ docker run -p 7860:7860 sql-arena
+
+ ## Run Tests
+
+ pytest tests/ -v
+
+ ## Project Structure
+
+ sql_arena/
+ - openenv.yaml (Environment metadata)
+ - Dockerfile (Container deployment)
+ - inference.py (Baseline inference script)
+ - src/sql_arena/
+ - models.py (Typed Pydantic models)
+ - environment.py (Core environment logic)
+ - tasks.py (9 SQL challenges)
+ - graders.py (Partial credit scoring)
+ - database.py (SQLite management)
+ - server.py (FastAPI server)
+ - tests/
+ - test_env.py (Test suite)
+
+ ## API Endpoints
+
+ | Method | Endpoint | Description |
+ |--------|----------|-------------|
+ | POST | /reset | Start new episode |
+ | POST | /step | Submit SQL query |
+ | GET | /state | Get current state |
+ | GET | /tasks | List available tasks |
+ | WS | /ws | WebSocket sessions |
+
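As a rough illustration of how a client might address the HTTP endpoints in the table, the sketch below builds (but does not send) requests with the standard library. The base URL assumes the server runs locally on port 7860 as in the `uvicorn` and `docker run` commands. The request bodies are assumptions: the `/step` body reuses the JSON shape from the Action Space section, and the empty `/reset` body is a guess, since the README does not document either payload. `build_request` is a hypothetical helper.

```python
import json
from urllib.request import Request

# Assumed local address; the port comes from the README's run commands.
BASE_URL = "http://localhost:7860"


def build_request(method, path, body=None):
    """Create a urllib Request for one endpoint, JSON-encoding any body."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


# Start a new episode (empty JSON body is an assumption).
reset_req = build_request("POST", "/reset", {})
# Submit one SQL query (body shape taken from the Action Space section).
step_req = build_request("POST", "/step", {"sql_query": "SELECT 1"})
# Read-only endpoints carry no body.
state_req = build_request("GET", "/state")
tasks_req = build_request("GET", "/tasks")

for req in (reset_req, step_req, state_req, tasks_req):
    print(req.get_method(), req.full_url)
```

With a server running, each request could be sent with `urllib.request.urlopen(req)`; the `/ws` WebSocket endpoint is left out because the standard library has no WebSocket client.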
+ ## License
+
+ MIT