File size: 5,412 Bytes
a656655
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
# TextToSQL

TextToSQL is a web application that lets you query a SQLite database using plain English. The system converts your natural language questions into SQL queries using an NLP model, executes these queries on the database, and then returns the results through an interactive Gradio-based chat interface.

## Table of Contents
- [Installation](#installation)
- [Database Setup](#database-setup)
- [Running the Application](#running-the-application)
- [Project Structure and File Explanations](#project-structure-and-file-explanations)
- [Usage](#usage)
- [Customization](#customization)

## Installation

1. **Clone the Repository**

   ```bash

   git clone https://github.com/chithindocha/TextToSQL.git

   cd TextToSQL

   ```

2. **Install Required Modules**

   Ensure you have Python (version 3.7 or higher) installed. Install all necessary modules using the provided `requirements.txt` file:

   ```bash

   pip install -r requirements.txt

   ```

## Database Setup

Before running the application, you need to initialize the SQLite database:

1. **Run the Initialization Script**

   Execute the following command to create the SQLite database (`chat_assistant.db`) with the necessary tables and sample data:

   ```bash

   python db/init_db.py

   ```

   This script will:
   - Create the **employees** and **departments** tables.
   - Create metadata tables (**table_metadata** and **column_metadata**).
   - Insert sample data into the **employees** and **departments** tables.
   
   **Note:** Feel free to modify or extend the sample data in `db/init_db.py` as needed.

## Running the Application

Start the web application by running:

```bash

python app.py

```

This command launches a Gradio web interface where you can ask the SQL Chat Assistant questions in plain English. The assistant will translate your query into SQL, execute it against the database, and display the results.

## Project Structure and File Explanations

### 1. **db/database.py**

- **Purpose:**  
  This file contains the `Database` class, which manages the SQLite database connection, executes SQL queries, and handles closing the connection.

- **Key Methods:**
  - `connect()`: Establishes a connection to the SQLite database.
  - `execute_query(query, params=None)`: Executes a SQL query with optional parameters.
  - `close()`: Closes the active database connection.

### 2. **db/init_db.py**



- **Purpose:**  

  Initializes the SQLite database by creating the necessary tables and inserting sample data.



- **Key Operations:**

  - Creates the **employees**, **departments**, **table_metadata**, and **column_metadata** tables.

  - Inserts sample records into the **employees** and **departments** tables.

  - Adds metadata entries for table and column descriptions.



- **Usage:**  

  Run this file to set up the database. You can modify the data here if you wish to customize the sample entries.



### 3. **nlp/query_processor.py**

- **Purpose:**  
  Processes user queries by converting natural language input into SQL statements and executing them.

- **Key Components:**
  - **Date Conversion:**  
    Contains methods (`convert_date` and `extract_and_convert_date`) to detect and convert various date formats found in queries.

  - **Query Processing:**  

    Uses the `SQLModel` (from `nlp/sql_model.py`) to generate an SQL query based on the user's input, executes the query using the `Database` class, and returns the results.


### 4. **nlp/sql_model.py**



- **Purpose:**  

  Handles the translation of natural language queries into SQL commands.



- **Key Functionality:**

  - Utilizes the Hugging Face `InferenceClient` with a specified model (default: `"HuggingFaceH4/zephyr-7b-beta"`) to convert plain English queries into SQL.

  - Provides a detailed prompt with schema information and example queries to guide the model.

  - Cleans and extracts the SQL command from the model's output before returning it.



### 5. **app.py**



- **Purpose:**  

  Serves as the main entry point for the application and sets up the web interface.



- **Key Operations:**

  - Initializes the `Database` and `QueryProcessor` objects.

  - Defines a `respond` function that processes user queries and fetches results from the database.

  - Uses Gradio’s `ChatInterface` to build an interactive chat UI where users can type their queries and view responses.



## Usage



1. **Start the Application**



   Run the following command to launch the chat interface:



   ```bash

   python app.py

   ```



2. **Ask a Query**



   In the chat window, type your question in plain English (e.g., "Show me all employees in the Sales department"). The application will:

   - Convert your query to an SQL command.

   - Execute the command against the SQLite database.

   - Return and display the results in the chat window.



## Customization



- **Database Content:**  

  You can modify the sample data or the database schema by editing the `db/init_db.py` file. Re-run the script to update the database with your changes.



- **SQL Model Prompt:**  
  Adjust the prompt and example queries in `nlp/sql_model.py` to better fit your database schema or use cases.

- **User Interface:**  
  Customize the Gradio interface settings in `app.py` to change the title, description, or layout of the chat UI.