---
title: Multi-Turn Text-to-SQL Agent
emoji: 🗃️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.13.0
app_file: app.py
pinned: false
license: mit
tags:
  - text-to-sql
  - agent
  - smolagents
  - multi-turn
  - clarification
---

# 🗃️ Multi-Turn Text-to-SQL Agent with Clarification

An intelligent SQL assistant that doesn't just generate SQL: it **thinks before querying**. When your question is ambiguous, it asks for clarification first. When data doesn't exist, it tells you why and suggests alternatives.

## 🎯 What Makes This Different

Traditional text-to-SQL systems blindly generate a query from your question. This agent follows a **3-step decision process** inspired by recent research:

1. **Classify**: Is the question answerable, ambiguous, or unanswerable?
2. **Clarify**: If ambiguous, ask the user targeted questions before generating SQL
3. **Execute & Verify**: Generate SQL, run it, self-correct if errors occur

## 🧪 Try These Examples

| Query | Expected Behavior |
|-------|------------------|
| `"Show me the top employees"` | 🤔 **Asks clarification**: Top by salary? Orders handled? Tenure? |
| `"Which customer spent the most?"` | ✅ **Answers directly** with SQL JOIN across orders/customers |
| `"What's the customer satisfaction score?"` | ❌ **Explains** the data doesn't exist, suggests alternatives |
| `"By salary, in Engineering"` (after ambiguous Q) | ✅ **Remembers context** and answers the clarified question |

## 🏗️ Architecture

```
    User Question
          │
          ▼
┌─────────────────────┐
│  Intent Classifier  │ ← Answerable / Ambiguous / Unanswerable
└─────────┬───────────┘
          │
    ┌─────┼─────┐
    ▼     ▼     ▼
  Clear Ambig  N/A
    │     │     │
    ▼     ▼     ▼
   SQL   Ask Explain
   Gen Clarify Why
    │     │
    ▼     ▼
 Execute User
   DB   Reply
    │     │
    ▼     └──→ (next turn)
 Results
```

## 📊 Demo Database

The Space comes with a pre-loaded **company database** (6 tables, ~60 rows):

- **departments**: Engineering, Sales, Marketing, HR, Finance
- **employees**: 12 employees with salary, hire date, department, manager
- **customers**: 8 B2B customers with tiers (standard/premium/enterprise)
- **products**: 8 products (Hardware/Software) with price, cost, stock
- **orders**: 12 orders with status (completed/shipped/pending/cancelled)
- **order_items**: 17 line items with quantity, price, discount

## 📚 Research Foundation

This agent's design draws from:

| Paper | Key Contribution |
|-------|-----------------|
| [MMSQL](https://arxiv.org/abs/2412.17867) | 4-type question classification (answerable/ambiguous/unanswerable/improper) |
| [PRACTIQ](https://arxiv.org/abs/2410.11076) | Multi-turn clarification dialogue patterns for SQL |
| [SQLFixAgent](https://arxiv.org/abs/2406.13408) | Self-correcting SQL via ReAct reasoning |
| [MTSQL-R1](https://arxiv.org/abs/2510.12831) | Agentic multi-turn SQL with memory verification |
| [Disambiguate-then-Parse](https://arxiv.org/abs/2502.18448) | Interpretation generation for ambiguous queries |

## 🔧 Technical Stack

- **Agent**: [smolagents](https://huggingface.co/docs/smolagents) `CodeAgent` with ReAct loop
- **LLM**: Qwen/Qwen2.5-Coder-32B-Instruct via HF Inference API
- **Database**: SQLite (in-memory demo)
- **UI**: Gradio chat interface with multi-turn support

## 🚀 Run Locally

```bash
pip install smolagents[gradio] sqlalchemy
export HF_TOKEN=your_token_here
python app.py
```

## License

MIT
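
## 🔁 Sketch: Execute & Verify Loop

The "Execute & Verify" step (generate SQL, run it, self-correct on errors) can be illustrated with a small retry loop. This is a minimal sketch under stated assumptions, not the code in `app.py`: `execute_with_retry`, `fake_generator`, and the two-column `employees` table are hypothetical names made up for the example, and the generator is a hard-coded stand-in for the LLM call.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple


def execute_with_retry(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> Tuple[str, List[tuple]]:
    """Run generated SQL; on a SQLite error, pass the message back to the generator."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate_sql(question, error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Keep the error text so the next attempt can correct itself.
            error = str(exc)
    raise RuntimeError(f"No valid SQL after {max_attempts} attempts: {error}")


def fake_generator(question: str, error: Optional[str]) -> str:
    # Hypothetical stand-in for the LLM: the first attempt uses a wrong column name,
    # the second attempt (made after seeing an error) uses the right one.
    if error is None:
        return "SELECT name, pay FROM employees ORDER BY pay DESC LIMIT 1"
    return "SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 1"


# Tiny in-memory table invented for the example (not the Space's demo schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ada", 150000.0), ("Grace", 140000.0)],
)

sql, rows = execute_with_retry(conn, "Who earns the most?", fake_generator)
print(sql)
print(rows)
```

Passing the error message back into the generator is what lets the second attempt differ from the first; capping the attempts keeps a persistently failing query from looping forever.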