---
title: Multi-Turn Text-to-SQL Agent
emoji: 🗃️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.13.0
app_file: app.py
pinned: false
license: mit
tags:
  - text-to-sql
  - agent
  - smolagents
  - multi-turn
  - clarification
---

🗃️ Multi-Turn Text-to-SQL Agent with Clarification

An intelligent SQL assistant that doesn't just generate SQL: it thinks before querying. When your question is ambiguous, it asks for clarification first. When data doesn't exist, it tells you why and suggests alternatives.

🎯 What Makes This Different

Traditional text-to-SQL systems blindly generate a query from your question. This agent follows a 3-step decision process inspired by recent research:

  1. Classify: Is the question answerable, ambiguous, or unanswerable?
  2. Clarify: If ambiguous, ask the user targeted questions before generating SQL
  3. Execute & Verify: Generate SQL, run it, self-correct if errors occur
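
The three steps can be pictured as a small router. This is a minimal sketch, not the Space's actual code: `Label`, `AgentReply`, and `route` are hypothetical names, and the classifier, SQL generator, clarifier, and explainer are passed in as plain callables that stand in for LLM calls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Label(Enum):
    ANSWERABLE = "answerable"
    AMBIGUOUS = "ambiguous"
    UNANSWERABLE = "unanswerable"


@dataclass
class AgentReply:
    kind: str  # "sql", "clarification", or "explanation"
    text: str  # SQL text, a clarifying question, or an explanation


def route(
    question: str,
    classify: Callable[[str], Label],
    generate_sql: Callable[[str], str],
    ask_clarification: Callable[[str], str],
    explain_gap: Callable[[str], str],
) -> AgentReply:
    """Classify first, then take exactly one branch.

    Every callable here is a hypothetical stand-in for an LLM call.
    """
    label = classify(question)
    if label is Label.AMBIGUOUS:
        # Step 2: ask before generating any SQL.
        return AgentReply("clarification", ask_clarification(question))
    if label is Label.UNANSWERABLE:
        # The schema cannot answer this; say why instead of guessing.
        return AgentReply("explanation", explain_gap(question))
    # Step 3 starts here: only answerable questions reach SQL generation.
    return AgentReply("sql", generate_sql(question))
```

Keeping the classifier separate from SQL generation means an ambiguous or unanswerable question never produces a query at all.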

🧪 Try These Examples

| Query | Expected Behavior |
| --- | --- |
| "Show me the top employees" | 🤔 Asks for clarification: top by salary, orders handled, or tenure? |
| "Which customer spent the most?" | ✅ Answers directly with a SQL JOIN across orders and customers |
| "What's the customer satisfaction score?" | ❌ Explains that the data doesn't exist and suggests alternatives |
| "By salary, in Engineering" (after an ambiguous question) | ✅ Remembers context and answers the clarified question |

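The last example relies on carrying the ambiguous question into the next turn. One way to picture that is the sketch below. The names `Conversation`, `mark_ambiguous`, and `resolve` are hypothetical, and the Space's real memory handling may work differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    """Holds the question that is waiting for a clarification, if any."""

    pending_question: Optional[str] = None

    def mark_ambiguous(self, question: str) -> None:
        # Remember the question so the next user message can be read as its answer.
        self.pending_question = question

    def resolve(self, user_message: str) -> str:
        # No pending question: treat the message as a fresh question.
        if self.pending_question is None:
            return user_message
        # Otherwise fold the clarification into the original question and clear the slot.
        combined = f"{self.pending_question} (clarification: {user_message})"
        self.pending_question = None
        return combined


chat = Conversation()
chat.mark_ambiguous("Show me the top employees")
clarified = chat.resolve("By salary, in Engineering")
```

Here `clarified` combines both turns into one question that can be classified again, which is why a reply like "By salary, in Engineering" makes sense to the agent.
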
🏗️ Architecture

User Question
     │
     ▼
┌─────────────────────┐
│  Intent Classifier  │  ← Answerable / Ambiguous / Unanswerable
└─────────┬───────────┘
          │
    ┌─────┼─────┐
    ▼     ▼     ▼
  Clear  Ambig  N/A
    │     │     │
    ▼     ▼     ▼
 SQL    Ask    Explain
 Gen   Clarify  Why
    │     │
    ▼     ▼
Execute  User
  DB    Reply
    │     │
    ▼     └──→ (next turn)
 Results
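
The "Execute DB" branch, combined with self-correction on errors, could look roughly like the sketch below. It assumes a plain `sqlite3` connection. `run_with_repair` and the `repair` callable are hypothetical: in practice `repair` would be an LLM call that sees the failed SQL and the error text.

```python
import sqlite3
from typing import Callable, List, Tuple


def run_with_repair(
    conn: sqlite3.Connection,
    sql: str,
    repair: Callable[[str, str], str],
    max_attempts: int = 3,
) -> Tuple[str, List[tuple]]:
    """Run `sql`; on a database error, ask `repair` for a new query and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # Success: return the query that worked together with its rows.
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            if attempt == max_attempts:
                raise RuntimeError(
                    f"giving up after {max_attempts} attempts: {exc}"
                ) from exc
            # Hypothetical repair step: receives the failed SQL and the error message.
            sql = repair(sql, str(exc))
    raise AssertionError("unreachable")
```

Capping the number of attempts keeps a query that cannot be fixed from looping forever, and returning the final SQL lets the UI show the query that actually ran.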

📊 Demo Database

The Space comes with a pre-loaded company database (6 tables, ~60 rows):

  • departments: Engineering, Sales, Marketing, HR, Finance
  • employees: 12 employees with salary, hire date, department, manager
  • customers: 8 B2B customers with tiers (standard/premium/enterprise)
  • products: 8 products (Hardware/Software) with price, cost, stock
  • orders: 12 orders with status (completed/shipped/pending/cancelled)
  • order_items: 17 line items with quantity, price, discount
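
To make the table relationships concrete, here is a sketch of how three of these tables might be joined to answer "Which customer spent the most?". Everything specific in it is an assumption: the column names, the fractional `discount` (0.25 meaning 25% off), the choice to exclude cancelled orders, and the placeholder rows, which are invented for this sketch and are not the demo data.

```python
import sqlite3

# Hypothetical column names; the Space's real schema may differ.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, tier TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    status TEXT
);
CREATE TABLE order_items (
    order_id INTEGER REFERENCES orders(id),
    quantity INTEGER,
    price REAL,
    discount REAL
);
"""

# Placeholder rows invented for this sketch, not the demo data.
ROWS = """
INSERT INTO customers VALUES (1, 'Customer A', 'standard'), (2, 'Customer B', 'enterprise');
INSERT INTO orders VALUES (10, 1, 'completed'), (11, 2, 'completed'), (12, 2, 'cancelled');
INSERT INTO order_items VALUES (10, 2, 50.0, 0.0), (11, 1, 300.0, 0.25), (12, 5, 100.0, 0.0);
"""

# Sum discounted line totals per customer, skipping cancelled orders (an assumption).
TOP_SPENDER = """
SELECT c.name, SUM(oi.quantity * oi.price * (1 - oi.discount)) AS total_spent
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
JOIN order_items AS oi ON oi.order_id = o.id
WHERE o.status != 'cancelled'
GROUP BY c.id, c.name
ORDER BY total_spent DESC
LIMIT 1;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA + ROWS)
top_customer = conn.execute(TOP_SPENDER).fetchone()
```

With these placeholder rows, `top_customer` is Customer B with a total of 225.0, because the cancelled order is left out of the sum.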

📚 Research Foundation

This agent's design draws from:

| Paper | Key Contribution |
| --- | --- |
| MMSQL | 4-type question classification (answerable/ambiguous/unanswerable/improper) |
| PRACTIQ | Multi-turn clarification dialogue patterns for SQL |
| SQLFixAgent | Self-correcting SQL via ReAct reasoning |
| MTSQL-R1 | Agentic multi-turn SQL with memory verification |
| Disambiguate-then-Parse | Interpretation generation for ambiguous queries |

🔧 Technical Stack

  • Agent: smolagents CodeAgent with ReAct loop
  • LLM: Qwen/Qwen2.5-Coder-32B-Instruct via HF Inference API
  • Database: SQLite (in-memory demo)
  • UI: Gradio chat interface with multi-turn support

🚀 Run Locally

pip install "smolagents[gradio]" sqlalchemy
export HF_TOKEN=your_token_here
python app.py

License

MIT