ar9avg committed on
Commit 9f7dd14 · 1 Parent(s): 965a112
Files changed (3)
  1. README.md +159 -31
  2. backend/main.py +4 -3
  3. frontend/src/components/Header.tsx +2 -2
README.md CHANGED
@@ -1,51 +1,179 @@
- # SQL Agent OpenEnv

- An OpenEnv-compliant RL environment for SQL generation, featuring:

- - **LinUCB contextual bandit**: selects repair strategies based on error context
- - **GEPA (Generative Evolutionary Prompt Adaptation)**: evolves the system prompt from failure patterns
- - **Multi-turn repair loop**: LLM receives full failure history so each retry learns from the previous error
- - **3 difficulty tiers**: easy / medium / hard benchmark tasks on a built-in e-commerce schema
- - **Shaped reward function**: success bonus, attempt penalty, error severity signal
- - **External database support**: connect any SQLite file or PostgreSQL database via connection string

- ## Background

- The original [gepa-tuned-sql-agent](https://github.com/Ar9av/gepa-tuned-sql-agent) explored three research ideas in a Next.js stack, which I started a week ago with this hackathon in mind; on realising the submission criteria, I had to migrate to Python:

- 1. **Self-debug loop**: the agent critiques and fixes its own SQL errors without human intervention
- 2. **GEPA prompt evolution**: after user feedback, an LLM reflects on failures and evolves the system prompt
- 3. **Mini-RL environment**: a LinUCB contextual bandit learns which repair strategy works best for each error class

- This repo preserves all three ideas and adapts them to a Python backend so the environment can be packaged as a Docker container and hosted on Hugging Face Spaces.

- ## Key differences from the original

- | | gepa-tuned-sql-agent | sql-agent-openenv (this repo) |
  |---|---|---|
- | Backend | Next.js API routes (TypeScript) | FastAPI (Python) |
- | Frontend | Next.js pages | React + Vite (static, served by FastAPI) |
- | LLM | Azure OpenAI | HF Router (Qwen 2.5-72B) |
- | Deployment | Vercel / local | Hugging Face Spaces (Docker) |
- | DB support | SQLite, PostgreSQL, MySQL | SQLite file + PostgreSQL DSN |

  ## OpenEnv API

  | Endpoint | Method | Description |
  |---|---|---|
- | `/reset` | POST | Start a new episode |
- | `/step` | POST | Execute one repair action |
- | `/state` | GET | Current environment state |

- ## Connect your own database

- In the UI click **Connect DB** (top-right) and enter:

- - **SQLite:** `/path/to/your/database.db` or `:memory:`
- - **PostgreSQL:** `postgresql://user:password@host:5432/dbname`

- The agent auto-detects the dialect and adjusts its prompt accordingly.

- ## Demo

- Click **Demo** in the top-right to watch the agent fail, self-repair via RL, then improve through two GEPA prompt-evolution cycles (42% → 91%).
+ ---
+ title: Self-Improving SQL Agent
+ emoji: 🧠
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ tags:
+ - sql
+ - reinforcement-learning
+ - contextual-bandit
+ - linucb
+ - gepa
+ ---

+ # Self-Improving SQL Agent

+ > **Live demo:** [huggingface.co/spaces/ar9av/sql-agent-openenv](https://huggingface.co/spaces/ar9av/sql-agent-openenv)
+ > **GitHub:** [Ar9av/sql-agent-openenv](https://github.com/Ar9av/sql-agent-openenv)

+ A SQL agent that gets better the more you use it. Ask questions in plain English: the agent writes SQL, executes it, and repairs its own mistakes using reinforcement learning. Every failure feeds back into a prompt evolution cycle (GEPA) that makes the next attempt smarter.

+ ---

+ ## What it does
+
+ 1. **Natural language → SQL**: type a question, get a query
+ 2. **Self-repair loop**: if the SQL fails, the agent diagnoses the error and retries with a different strategy (up to 5 attempts). Each retry sees the full history of previous failures so it doesn't repeat the same mistake
+ 3. **Reinforcement learning**: a LinUCB contextual bandit learns which of 8 repair strategies works best for each error class (wrong column, bad JOIN, syntax error, wrong dialect, etc.)
+ 4. **Prompt evolution (GEPA)**: every N queries the system reflects on its failure patterns and rewrites its own system prompt to be more accurate going forward
+ 5. **Connect your own DB**: drop in any SQLite file or PostgreSQL connection string; the agent introspects the schema and generates relevant example questions automatically
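The bandit step in item 3 can be sketched in a few lines of Python. This is a minimal illustration of disjoint LinUCB, not the code in `backend/rl/bandit.py`: the class name `LinUCB`, the `alpha` exploration weight, the two-dimensional one-hot context, and the toy reward rule are all assumptions made for the example. Only the arm names are taken from the README's action list.

```python
import math


class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per arm (illustrative only)."""

    def __init__(self, n_arms: int, dim: int, alpha: float = 1.0) -> None:
        self.alpha = alpha
        self.dim = dim
        # A_inv[a] starts as the identity (ridge prior); b[a] starts at zero.
        self.A_inv = [[[1.0 if i == j else 0.0 for j in range(dim)]
                       for i in range(dim)] for _ in range(n_arms)]
        self.b = [[0.0] * dim for _ in range(n_arms)]

    def _matvec(self, m, v):
        return [sum(m[i][j] * v[j] for j in range(self.dim)) for i in range(self.dim)]

    def score(self, arm: int, x: list) -> float:
        """Upper confidence bound: theta . x + alpha * sqrt(x^T A^-1 x)."""
        a_inv_x = self._matvec(self.A_inv[arm], x)
        theta = self._matvec(self.A_inv[arm], self.b[arm])
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, a_inv_x)), 0.0))
        return mean + self.alpha * width

    def select(self, x: list) -> int:
        scores = [self.score(a, x) for a in range(len(self.b))]
        return scores.index(max(scores))  # ties go to the lowest arm index

    def update(self, arm: int, x: list, reward: float) -> None:
        # Sherman-Morrison rank-one update of A^-1 after adding x x^T to A.
        a_inv_x = self._matvec(self.A_inv[arm], x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[arm][i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(self.dim):
            self.b[arm][i] += reward * x[i]


# Toy setup (assumed): two arms, two error classes encoded one-hot.
ARMS = ["fix_column", "fix_syntax"]
WRONG_COLUMN, SYNTAX_ERROR = [1.0, 0.0], [0.0, 1.0]

bandit = LinUCB(n_arms=len(ARMS), dim=2)
for round_no in range(50):
    context = WRONG_COLUMN if round_no % 2 == 0 else SYNTAX_ERROR
    arm = bandit.select(context)
    # Hypothetical reward: 1 when the strategy matches the error class, else 0.
    reward = 1.0 if context[arm] == 1.0 else 0.0
    bandit.update(arm, context, reward)

print(ARMS[bandit.select(WRONG_COLUMN)], ARMS[bandit.select(SYNTAX_ERROR)])
```

After the 50 toy rounds the bandit picks `fix_column` for the wrong-column context and `fix_syntax` for the syntax-error context. A real feature vector would carry more than the error class (attempt number, for instance), which is where a linear model starts to pay off over a lookup table.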
+
+ ---
+
+ ## Quickstart
+
+ ### Run locally
+
+ ```bash
+ # 1. Clone
+ git clone https://github.com/Ar9av/sql-agent-openenv
+ cd sql-agent-openenv
+
+ # 2. Install backend dependencies
+ cd backend
+ pip install -r requirements.txt
+
+ # 3. Set environment variables
+ export HF_TOKEN=your_huggingface_token # required, no default
+ export API_BASE_URL=https://router.huggingface.co/v1 # optional
+ export MODEL_NAME=Qwen/Qwen2.5-72B-Instruct # optional

+ # 4. Build the frontend
+ cd ../frontend
+ npm install && npm run build

+ # 5. Start the server
+ cd ../backend
+ uvicorn main:app --host 0.0.0.0 --port 8000
+ ```

+ Open [http://localhost:8000](http://localhost:8000).
+
+ ### Run with Docker
+
+ ```bash
+ docker build -t self-improving-sql-agent .
+ docker run -p 7860:7860 \
+   -e HF_TOKEN=your_token \
+   self-improving-sql-agent
+ ```
+
+ ### Environment variables
+
+ | Variable | Default | Required |
  |---|---|---|
+ | `HF_TOKEN` | (none) | **Yes** |
+ | `API_BASE_URL` | `https://router.huggingface.co/v1` | No |
+ | `MODEL_NAME` | `Qwen/Qwen2.5-72B-Instruct` | No |
+ | `GEPA_OPTIMIZE_EVERY` | `4` | No |
+ | `DATA_DIR` | `./data` | No |
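One way a backend could read the variables in the table above is sketched below. The variable names and defaults come from the table; the `Settings` dataclass and the `load_settings` helper are hypothetical names introduced for this example and may not match how the repository actually loads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    hf_token: str
    api_base_url: str
    model_name: str
    gepa_optimize_every: int
    data_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, failing fast when HF_TOKEN is missing."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is required and has no default")
    return Settings(
        hf_token=token,
        api_base_url=env.get("API_BASE_URL", "https://router.huggingface.co/v1"),
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
        # The table lists the default as 4 (queries per GEPA cycle).
        gepa_optimize_every=int(env.get("GEPA_OPTIMIZE_EVERY", "4")),
        data_dir=env.get("DATA_DIR", "./data"),
    )


# Usage with an explicit mapping, so the example does not depend on the shell.
settings = load_settings({"HF_TOKEN": "your_token", "GEPA_OPTIMIZE_EVERY": "8"})
print(settings.model_name, settings.gepa_optimize_every)
```

Passing the environment as a mapping keeps the function easy to test; calling `load_settings()` with no argument reads the real process environment.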
+
+ ---
+
+ ## Using the UI
+
+ ### Chat tab
+ Type any question about your data. The agent streams SQL token-by-token, executes it, and shows results in a table. If it fails, watch it diagnose the error and retry with a new strategy.
+
+ - **Correct / Wrong buttons**: rate the result. Wrong answers open a remark field; your feedback is fed directly into the next GEPA optimization cycle
+ - **Retry differently**: re-runs the query with the previous bad SQL as context so the agent avoids repeating the same approach
+
+ ### ER Diagram tab
+ Visual schema explorer showing all tables, columns, and foreign key relationships.
+
+ ### Benchmark tab *(built-in DB only)*
+ Run the agent against a fixed set of easy / medium / hard questions and get an overall accuracy score.
+
+ ### Right sidebar: System Prompt & GEPA
+ See the live system prompt the agent is using. A progress bar shows how far through the current optimization cycle you are (e.g. `2/4 · optimizes every 4 queries`). After each cycle the prompt is rewritten and the generation badge updates.
+
+ ### Connect your own database
+ Click **Connect DB** in the top-right:
+
+ - **SQLite:** `/path/to/database.db` or `:memory:`
+ - **PostgreSQL:** `postgresql://user:password@host:5432/dbname`
+
+ The agent auto-detects the dialect (SQLite vs PostgreSQL), adjusts its prompt, introspects the schema, and uses the LLM to generate 5 example questions specific to your data. The Benchmark tab and difficulty controls are hidden for custom databases.
+
+ ---

  ## OpenEnv API

+ The environment exposes a standard OpenEnv interface for agent training:
+
  | Endpoint | Method | Description |
  |---|---|---|
+ | `/reset` | POST | Start a new episode, returns `Observation` |
+ | `/step` | POST | Execute one repair action, returns `{observation, reward}` |
+ | `/state` | GET | Current episode state |
+ | `/env/tasks` | GET | List all tasks and questions |
+ | `/env/info` | GET | Environment metadata (action/observation space) |
+
+ **Stdout** emits structured logs for each episode:
+ ```
+ [START] {"task_id": "...", "question": "...", "max_attempts": 5}
+ [STEP] {"attempt": 1, "action": "generate", "reward": 0.8, "success": true, "done": true}
+ [END] {"success": true, "attempts": 1, "total_reward": 0.8}
+ ```
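Because each log line is a tag followed by a JSON object, a training harness can recover episode results from stdout with very little code. The sketch below assumes exactly the three tags and the field names shown in the sample above. The helper names `parse_episode_logs` and `summarize` are made up for this example, and the `task_id` and `question` values in `sample` are placeholders.

```python
import json
import re

# One tag in square brackets, whitespace, then a JSON object to end of line.
LOG_LINE = re.compile(r"^\[(START|STEP|END)\]\s+(\{.*\})\s*$")


def parse_episode_logs(lines):
    """Return (tag, payload) tuples; lines that do not match are skipped."""
    events = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        try:
            payload = json.loads(match.group(2))
        except json.JSONDecodeError:
            continue  # tolerate truncated or interleaved output
        events.append((match.group(1), payload))
    return events


def summarize(events):
    """Collapse one episode's events into a small result record."""
    steps = [p for tag, p in events if tag == "STEP"]
    end = next((p for tag, p in events if tag == "END"), None)
    return {
        "steps": len(steps),
        "actions": [p.get("action") for p in steps],
        "success": bool(end and end.get("success")),
        "total_reward": end.get("total_reward") if end else None,
    }


sample = [
    '[START] {"task_id": "t1", "question": "q", "max_attempts": 5}',  # placeholder values
    "some unrelated server log line",
    '[STEP] {"attempt": 1, "action": "generate", "reward": 0.8, "success": true, "done": true}',
    '[END] {"success": true, "attempts": 1, "total_reward": 0.8}',
]
print(summarize(parse_episode_logs(sample)))
```

Skipping non-matching lines rather than raising keeps the parser usable when the episode logs share stdout with ordinary server output.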
+
+ **Action space**: `generate` for the first attempt, plus 8 discrete repair strategies:
+ `rewrite_full`, `fix_column`, `fix_table`, `add_groupby`, `rewrite_cte`, `fix_syntax`, `change_dialect`, `relax_filter`
+
+ ---
+
+ ## Architecture
+
+ ```
+ frontend/                   React + Vite (served as static files by FastAPI)
+ backend/
+   main.py                   FastAPI entry point
+   api/
+     demo.py                 SSE streaming endpoints (chat, benchmark, GEPA events)
+     openenv.py              OpenEnv spec routes (/reset, /step, /state)
+   env/
+     sql_env.py              SQLAgentEnv: episode management, LLM calls
+     database.py             SQLite + PostgreSQL abstraction
+     tasks.py                Benchmark task definitions and grader
+   rl/
+     types.py                RepairAction enum, RLState, featurize()
+     bandit.py               LinUCB contextual bandit
+     repair_strategies.py    8 repair prompt templates
+     grader.py               Shaped reward function
+   gepa/
+     optimizer.py            GEPA: reflect → mutate → score → pareto front
+ ```
+
+ ---

+ ## Background

+ > **Origin:** This is a port of [gepa-tuned-sql-agent](https://github.com/Ar9av/gepa-tuned-sql-agent) (originally built as a TypeScript/Next.js application), rewritten in Python (FastAPI + React/Vite) to match the OpenEnv format and deploy on Hugging Face Spaces.

+ The original explored three research ideas in a Next.js stack, started ~1 week before the submission deadline. When it became clear the submission required a Python OpenEnv environment, the whole stack was migrated.

+ 1. **Self-debug loop**: the agent critiques and fixes its own SQL errors without human intervention
+ 2. **GEPA prompt evolution**: after user feedback, an LLM reflects on failures and evolves the system prompt
+ 3. **Mini-RL environment**: a LinUCB contextual bandit learns which repair strategy works best for each error class

+ ### Key differences from the original

+ | | gepa-tuned-sql-agent | Self-Improving SQL Agent (this repo) |
+ |---|---|---|
+ | Backend | Next.js API routes (TypeScript) | FastAPI (Python) |
+ | Frontend | Next.js pages | React + Vite (static, served by FastAPI) |
+ | LLM | Azure OpenAI | HF Router (Qwen 2.5-72B) |
+ | Deployment | Vercel / local | Hugging Face Spaces (Docker) |
+ | DB support | SQLite, PostgreSQL, MySQL | SQLite file + PostgreSQL DSN |
+ | Repair context | Single-shot per attempt | Multi-turn: full failure history passed to each retry |
backend/main.py CHANGED
@@ -28,10 +28,11 @@ from api.openenv import router as openenv_router, ResetRequest, StepRequest, env
  from env.database import ensure_seeded

  app = FastAPI(
-     title="SQL Agent OpenEnv",
+     title="Self-Improving SQL Agent",
      description=(
-         "A SQL generation environment powered by a LinUCB contextual bandit "
-         "and GEPA prompt evolution, built for the Meta + Hugging Face OpenEnv hackathon."
+         "A SQL generation environment that learns from its own mistakes. "
+         "Powered by a LinUCB contextual bandit for repair strategy selection "
+         "and GEPA prompt evolution for continuous self-improvement."
      ),
      version="1.0.0",
  )
frontend/src/components/Header.tsx CHANGED
@@ -37,10 +37,10 @@ export function Header({ onToggleLeft, onToggleRight, onDemo, onConnectDb }: Hea
  {/* Title */}
  <div>
    <h1 className="text-sm font-bold text-white tracking-tight leading-none">
-     SQL Agent OpenEnv
+     Self-Improving SQL Agent
    </h1>
    <p className="text-[10px] text-gray-600 hidden sm:block mt-0.5">
-     Reinforcement Learning Environment
+     LinUCB · GEPA · OpenEnv
    </p>
  </div>
  </div>