TextToSQL / README.md
Chit1324's picture
Upload README.md
a656655 verified
|
raw
history blame
5.41 kB
# TextToSQL
TextToSQL is a web application that lets you query a SQLite database using plain English. The system converts your natural language questions into SQL queries using an NLP model, executes these queries on the database, and then returns the results through an interactive Gradio-based chat interface.
## Table of Contents
- [Installation](#installation)
- [Database Setup](#database-setup)
- [Running the Application](#running-the-application)
- [Project Structure and File Explanations](#project-structure-and-file-explanations)
- [Usage](#usage)
- [Customization](#customization)
## Installation
1. **Clone the Repository**
```bash
git clone https://github.com/chithindocha/TextToSQL.git
cd TextToSQL
```
2. **Install Required Modules**
Ensure you have Python (version 3.7 or higher) installed. Install all necessary modules using the provided `requirements.txt` file:
```bash
pip install -r requirements.txt
```
## Database Setup
Before running the application, you need to initialize the SQLite database:
1. **Run the Initialization Script**
Execute the following command to create the SQLite database (`chat_assistant.db`) with the necessary tables and sample data:
```bash
python db/init_db.py
```
This script will:
- Create the **employees** and **departments** tables.
- Create metadata tables (**table_metadata** and **column_metadata**).
- Insert sample data into the **employees** and **departments** tables.
**Note:** Feel free to modify or extend the sample data in `db/init_db.py` as needed.
## Running the Application
Start the web application by running:
```bash
python app.py
```
This command launches a Gradio web interface where you can ask the SQL Chat Assistant questions in plain English. The assistant will translate your query into SQL, execute it against the database, and display the results.
## Project Structure and File Explanations
### 1. **db/database.py**
- **Purpose:**
This file contains the `Database` class, which manages the SQLite database connection, executes SQL queries, and handles closing the connection.
- **Key Methods:**
- `connect()`: Establishes a connection to the SQLite database.
- `execute_query(query, params=None)`: Executes a SQL query with optional parameters.
- `close()`: Closes the active database connection.
### 2. **db/init_db.py**
- **Purpose:**
Initializes the SQLite database by creating the necessary tables and inserting sample data.
- **Key Operations:**
- Creates the **employees**, **departments**, **table_metadata**, and **column_metadata** tables.
- Inserts sample records into the **employees** and **departments** tables.
- Adds metadata entries for table and column descriptions.
- **Usage:**
Run this file to set up the database. You can modify the data here if you wish to customize the sample entries.
### 3. **nlp/query_processor.py**
- **Purpose:**
Processes user queries by converting natural language input into SQL statements and executing them.
- **Key Components:**
- **Date Conversion:**
Contains methods (`convert_date` and `extract_and_convert_date`) to detect and convert various date formats found in queries.
- **Query Processing:**
Uses the `SQLModel` (from `nlp/sql_model.py`) to generate an SQL query based on the user's input, executes the query using the `Database` class, and returns the results.
### 4. **nlp/sql_model.py**
- **Purpose:**
Handles the translation of natural language queries into SQL commands.
- **Key Functionality:**
- Utilizes the Hugging Face `InferenceClient` with a specified model (default: `"HuggingFaceH4/zephyr-7b-beta"`) to convert plain English queries into SQL.
- Provides a detailed prompt with schema information and example queries to guide the model.
- Cleans and extracts the SQL command from the model's output before returning it.
### 5. **app.py**
- **Purpose:**
Serves as the main entry point for the application and sets up the web interface.
- **Key Operations:**
- Initializes the `Database` and `QueryProcessor` objects.
- Defines a `respond` function that processes user queries and fetches results from the database.
- Uses Gradio’s `ChatInterface` to build an interactive chat UI where users can type their queries and view responses.
## Usage
1. **Start the Application**
Run the following command to launch the chat interface:
```bash
python app.py
```
2. **Ask a Query**
In the chat window, type your question in plain English (e.g., "Show me all employees in the Sales department"). The application will:
- Convert your query to an SQL command.
- Execute the command against the SQLite database.
- Return and display the results in the chat window.
## Customization
- **Database Content:**
You can modify the sample data or the database schema by editing the `db/init_db.py` file. Re-run the script to update the database with your changes.
- **SQL Model Prompt:**
Adjust the prompt and example queries in `nlp/sql_model.py` to better fit your database schema or use cases.
- **User Interface:**
Customize the Gradio interface settings in `app.py` to change the title, description, or layout of the chat UI.