| title: Multi-Turn Text-to-SQL Agent | |
| emoji: 🗄️ | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: gradio | |
| sdk_version: 6.13.0 | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| tags: | |
| - text-to-sql | |
| - agent | |
| - smolagents | |
| - multi-turn | |
| - clarification | |
| # Multi-Turn Text-to-SQL Agent with Clarification | |
| A SQL assistant that does more than generate SQL: it **thinks before querying**. When your question is ambiguous, it asks for clarification first. When the requested data doesn't exist, it tells you why and suggests alternatives. | |
| ## What Makes This Different | |
| Traditional text-to-SQL systems generate a query directly from your question, even when the question is ambiguous or cannot be answered from the schema. This agent follows a **3-step decision process** inspired by recent research: | |
| 1. **Classify**: Is the question answerable, ambiguous, or unanswerable? | |
| 2. **Clarify**: If ambiguous, ask the user targeted questions before generating SQL | |
| 3. **Execute & Verify**: Generate SQL, run it, and self-correct if errors occur | |
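The three steps above can be sketched as a small routing function. This is a minimal illustration, not the Space's actual code: `Intent`, `Turn`, `handle_question`, and the keyword-based `toy_classify` are hypothetical names, and the toy classifier merely stands in for the LLM-backed classification step.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Intent(Enum):
    ANSWERABLE = "answerable"
    AMBIGUOUS = "ambiguous"
    UNANSWERABLE = "unanswerable"


@dataclass
class Turn:
    kind: str      # "sql", "clarification", or "explanation"
    content: str


def handle_question(
    question: str,
    classify: Callable[[str], Intent],
    generate_sql: Callable[[str], str],
    ask_clarification: Callable[[str], str],
    explain_gap: Callable[[str], str],
) -> Turn:
    """Route one user question through the classify / clarify / execute steps."""
    intent = classify(question)
    if intent is Intent.AMBIGUOUS:
        # Step 2: ask before generating any SQL.
        return Turn("clarification", ask_clarification(question))
    if intent is Intent.UNANSWERABLE:
        # The schema cannot answer this; explain instead of guessing.
        return Turn("explanation", explain_gap(question))
    # Step 3: the question is clear enough to generate SQL.
    return Turn("sql", generate_sql(question))


def toy_classify(question: str) -> Intent:
    """Keyword stand-in for the LLM classifier (assumption, illustration only)."""
    text = question.lower()
    if "satisfaction" in text:
        return Intent.UNANSWERABLE
    if "top employees" in text:
        return Intent.AMBIGUOUS
    return Intent.ANSWERABLE


turn = handle_question(
    "Show me the top employees",
    classify=toy_classify,
    generate_sql=lambda q: "SELECT 1",
    ask_clarification=lambda q: "Top by salary, orders handled, or tenure?",
    explain_gap=lambda q: "That data is not in the schema.",
)
print(turn.kind, "->", turn.content)
```

Passing the steps in as callables keeps the routing logic testable without a model; in a real agent each callable would wrap an LLM call.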
| ## Try These Examples | |
| | Query | Expected Behavior | | |
| |-------|------------------| | |
| | `"Show me the top employees"` | **Asks for clarification**: Top by salary? Orders handled? Tenure? | |
| | `"Which customer spent the most?"` | **Answers directly** with a SQL JOIN across orders/customers | |
| | `"What's the customer satisfaction score?"` | **Explains** that the data doesn't exist and suggests alternatives | |
| | `"By salary, in Engineering"` (after the ambiguous question) | **Remembers context** and answers the clarified question | |
| ## Architecture | |
| ``` | |
|      User Question | |
|            │ | |
|            ▼ | |
| ┌─────────────────────┐ | |
| │  Intent Classifier  │ ← Answerable / Ambiguous / Unanswerable | |
| └──────────┬──────────┘ | |
|            │ | |
|     ┌──────┼──────┐ | |
|     ▼      ▼      ▼ | |
|   Clear  Ambig   N/A | |
|     │      │      │ | |
|     ▼      ▼      ▼ | |
|    SQL    Ask  Explain | |
|    Gen  Clarify  Why | |
|     │      │ | |
|     ▼      ▼ | |
|  Execute User | |
|    DB    Reply | |
|     │      │ | |
|     ▼      └──→ (next turn) | |
|  Results | |
| ``` | |
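The "Execute DB" branch, combined with the self-correction step, can be sketched as a bounded retry loop. This is an illustrative sketch only: `run_with_repair` and its `repair` callback are hypothetical names, and the string-replacing lambda stands in for an LLM that rewrites the query from the error message.

```python
import sqlite3
from typing import Callable


def run_with_repair(
    conn: sqlite3.Connection,
    sql: str,
    repair: Callable[[str, str], str],
    max_attempts: int = 3,
) -> list[tuple]:
    """Execute sql; on a database error, ask repair(sql, error) for a new query."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sql = repair(sql, str(exc))
    raise AssertionError("loop always returns or raises")


# Tiny in-memory table with made-up rows, just to exercise the loop.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [("Ada", 120.0), ("Bob", 100.0)])

# The first query misspells the table name; the stand-in repair step fixes it.
rows = run_with_repair(
    conn,
    "SELECT name FROM employee ORDER BY salary DESC",
    repair=lambda sql, err: sql.replace("FROM employee ", "FROM employees "),
)
print(rows)
```

Capping the number of attempts keeps a persistently failing query from looping forever; the final error is re-raised so the caller can explain the failure to the user.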
| ## Demo Database | |
| The Space comes with a pre-loaded **company database** (6 tables, ~60 rows): | |
| - **departments**: Engineering, Sales, Marketing, HR, Finance | |
| - **employees**: 12 employees with salary, hire date, department, manager | |
| - **customers**: 8 B2B customers with tiers (standard/premium/enterprise) | |
| - **products**: 8 products (Hardware/Software) with price, cost, stock | |
| - **orders**: 12 orders with status (completed/shipped/pending/cancelled) | |
| - **order_items**: 17 line items with quantity, price, discount | |
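To make the table relationships concrete, here is a hedged sketch of how three of these tables might be laid out in in-memory SQLite, and the kind of JOIN a question like "Which customer spent the most?" could produce. The column names (`customer_id`, `unit_price`, and so on), the treatment of `discount` as a fraction, and the sample rows are all assumptions for illustration, not the Space's actual schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed columns; the real demo schema may differ.
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, tier TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    status TEXT
);
CREATE TABLE order_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(id),
    quantity INTEGER,
    unit_price REAL,
    discount REAL  -- assumed to be a fraction between 0 and 1
);
-- Made-up rows, not the Space's data.
INSERT INTO customers VALUES (1, 'Acme', 'enterprise'), (2, 'Globex', 'standard');
INSERT INTO orders VALUES (1, 1, 'completed'), (2, 2, 'completed'), (3, 2, 'cancelled');
INSERT INTO order_items VALUES
    (1, 1, 2, 100.0, 0.0),
    (2, 2, 1, 150.0, 0.1),
    (3, 3, 10, 500.0, 0.0);
""")

# Total spend per customer, ignoring cancelled orders, highest first.
top_customer = conn.execute("""
SELECT c.name,
       SUM(oi.quantity * oi.unit_price * (1 - oi.discount)) AS spent
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
JOIN order_items AS oi ON oi.order_id = o.id
WHERE o.status != 'cancelled'
GROUP BY c.id, c.name
ORDER BY spent DESC
LIMIT 1
""").fetchone()
print(top_customer)
```

Whether cancelled orders should count toward "spent" is itself a judgment call; this sketch excludes them, which is the kind of detail a clarification turn could confirm with the user.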
| ## Research Foundation | |
| This agent's design draws from: | |
| | Paper | Key Contribution | | |
| |-------|-----------------| | |
| | [MMSQL](https://arxiv.org/abs/2412.17867) | 4-type question classification (answerable/ambiguous/unanswerable/improper) | | |
| | [PRACTIQ](https://arxiv.org/abs/2410.11076) | Multi-turn clarification dialogue patterns for SQL | | |
| | [SQLFixAgent](https://arxiv.org/abs/2406.13408) | Self-correcting SQL via ReAct reasoning | | |
| | [MTSQL-R1](https://arxiv.org/abs/2510.12831) | Agentic multi-turn SQL with memory verification | | |
| | [Disambiguate-then-Parse](https://arxiv.org/abs/2502.18448) | Interpretation generation for ambiguous queries | | |
| ## Technical Stack | |
| - **Agent**: [smolagents](https://huggingface.co/docs/smolagents) `CodeAgent` with ReAct loop | |
| - **LLM**: Qwen/Qwen2.5-Coder-32B-Instruct via HF Inference API | |
| - **Database**: SQLite (in-memory demo) | |
| - **UI**: Gradio chat interface with multi-turn support | |
| ## Run Locally | |
| ```bash | |
| pip install "smolagents[gradio]" sqlalchemy | |
| export HF_TOKEN=your_token_here | |
| python app.py | |
| ``` | |
| ## License | |
| MIT | |