Merge pull request #7 from pkgprateek/feature/multi-upload-streaming
- README-HF.md +32 -27
- README.md +76 -83
- app/main.py +120 -53
- app/rag_pipeline.py +100 -0
README-HF.md
CHANGED
|
@@ -12,45 +12,52 @@ short_description: Document intelligence for Legal, Research, FinOps
|
|
| 12 |
full_width: true
|
| 13 |
---
|
| 14 |
|
| 15 |
-
#
|
| 16 |
|
| 17 |
-
**
|
| 18 |
|
| 19 |
-
Upload contracts, research papers, or financial reports → Ask questions
|
| 20 |
|
| 21 |
---
|
| 22 |
|
| 23 |
## How It Works
|
| 24 |
|
| 25 |
-
```
|
| 26 |
-
|
| 27 |
-
A["📄 Upload"] --> B["✂️ Chunk"]
|
| 28 |
-
B --> C["🧠 Embed"]
|
| 29 |
-
C --> D["💬 Ask"]
|
| 30 |
-
D --> E["✨ Cited Answer"]
|
| 31 |
```
|
| 32 |
|
| 33 |
-
**3 steps**: Upload → Ask → Get answers with citations.
|
| 34 |
|
| 35 |
---
|
| 36 |
|
| 37 |
## Try It Now
|
| 38 |
|
| 39 |
-
1. **Select a vertical**
|
| 40 |
-
2. **
|
| 41 |
-
3. **
|
|
|
|
| 42 |
|
| 43 |
-
No signup required.
|
| 44 |
|
| 45 |
---
|
| 46 |
|
| 47 |
## Features
|
| 48 |
|
| 49 |
-
|
| 50 |
-
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
|
|
|
|
|
|
|
| 54 |
|
| 55 |
---
|
| 56 |
|
|
@@ -60,30 +67,28 @@ No signup required. Your documents are processed locally and auto-deleted after
|
|
| 60 |
git clone https://github.com/pkgprateek/rag-document-qa-workflow.git
|
| 61 |
cd rag-document-qa-workflow
|
| 62 |
echo "GROQ_API_KEY=your_key" > .env
|
| 63 |
-
echo "OPENROUTER_API_KEY=your_key" >> .env
|
| 64 |
docker compose up
|
| 65 |
# → http://localhost:7860
|
| 66 |
```
|
| 67 |
|
| 68 |
-
**
|
| 69 |
-
[View source on GitHub](https://github.com/pkgprateek/rag-document-qa-workflow)
|
| 70 |
|
| 71 |
---
|
| 72 |
|
| 73 |
## 🔒 Privacy
|
| 74 |
|
| 75 |
-
- Documents processed locally
|
| 76 |
-
-
|
| 77 |
- Auto-deleted after 7 days
|
| 78 |
-
- Never used for
|
| 79 |
|
| 80 |
---
|
| 81 |
|
| 82 |
## Enterprise Pilots
|
| 83 |
|
| 84 |
-
**2-week paid pilots** for teams ready to deploy RAG on their
|
| 85 |
|
| 86 |
-
📅 [Book discovery call](https://cal.com/
|
| 87 |
|
| 88 |
---
|
| 89 |
|
|
|
|
| 12 |
full_width: true
|
| 13 |
---
|
| 14 |
|
| 15 |
+
# Enterprise RAG Platform
|
| 16 |
|
| 17 |
+
**Turn documents into answers. Instantly.**
|
| 18 |
|
| 19 |
+
Upload contracts, research papers, or financial reports → Ask questions → Get cited answers in seconds.
|
| 20 |
+
|
| 21 |
+
---
|
| 22 |
+
|
| 23 |
+
## ✨ What's New
|
| 24 |
+
|
| 25 |
+
- **Multi-document upload** — Process multiple files at once
|
| 26 |
+
- **Streaming answers** — Watch responses generate in real-time
|
| 27 |
+
- **Thinking indicator** — See "🔍 Analyzing documents..." before streaming starts
|
| 28 |
|
| 29 |
---
|
| 30 |
|
| 31 |
## How It Works
|
| 32 |
|
| 33 |
+
```
|
| 34 |
+
📄 Upload → ✂️ Chunk → 🧠 Embed → 💬 Ask → ✨ Cited Answer
|
| 35 |
```
|
| 36 |
|
| 37 |
+
**3 steps**: Upload your documents → Ask questions → Get answers with page citations.
|
| 38 |
|
| 39 |
---
|
| 40 |
|
| 41 |
## Try It Now
|
| 42 |
|
| 43 |
+
1. **Select a vertical** — Legal, Research, or FinOps samples pre-loaded
|
| 44 |
+
2. **Or upload your own** — PDF, DOCX, TXT supported (batch upload enabled)
|
| 45 |
+
3. **Ask anything** — Natural language questions
|
| 46 |
+
4. **Get streaming answers** — Watch the AI think and respond in real-time
|
| 47 |
|
| 48 |
+
No signup required. Documents are auto-deleted after 7 days.
|
| 49 |
|
| 50 |
---
|
| 51 |
|
| 52 |
## Features
|
| 53 |
|
| 54 |
+
| Feature | Description |
|
| 55 |
+
|---------|-------------|
|
| 56 |
+
| **Multi-upload** | Upload multiple files at once |
|
| 57 |
+
| **Streaming** | Real-time token-by-token answers |
|
| 58 |
+
| **Citations** | Every answer links to source + page |
|
| 59 |
+
| **3 AI models** | GPT-OSS 120B, Llama 3.3, Gemma 3 |
|
| 60 |
+
| **Privacy** | Session isolation, 7-day auto-delete |
|
| 61 |
|
| 62 |
---
|
| 63 |
|
|
|
|
| 67 |
git clone https://github.com/pkgprateek/rag-document-qa-workflow.git
|
| 68 |
cd rag-document-qa-workflow
|
| 69 |
echo "GROQ_API_KEY=your_key" > .env
|
|
|
|
| 70 |
docker compose up
|
| 71 |
# → http://localhost:7860
|
| 72 |
```
|
| 73 |
|
| 74 |
+
**API Keys:** [Groq](https://console.groq.com/keys) (Required) · [OpenRouter](https://openrouter.ai/keys) (Optional)
|
|
|
|
| 75 |
|
| 76 |
---
|
| 77 |
|
| 78 |
## 🔒 Privacy
|
| 79 |
|
| 80 |
+
- Documents processed locally
|
| 81 |
+
- Session-isolated storage
|
| 82 |
- Auto-deleted after 7 days
|
| 83 |
+
- Never used for training
|
| 84 |
|
| 85 |
---
|
| 86 |
|
| 87 |
## Enterprise Pilots
|
| 88 |
|
| 89 |
+
**2-week paid pilots** for teams ready to deploy RAG on their infrastructure.
|
| 90 |
|
| 91 |
+
📅 [Book discovery call](https://cal.com/prateekgoel/30m-discovery-call)
|
| 92 |
|
| 93 |
---
|
| 94 |
|
README.md
CHANGED
|
@@ -1,80 +1,101 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
-
|
| 4 |
|
| 5 |
[](https://pkgprateek-ai-rag-document.hf.space/)
|
| 6 |
[](https://github.com/pkgprateek/ai-rag-document/actions/workflows/deploy-to-hf.yml)
|
| 7 |
[](https://www.python.org/downloads/)
|
| 8 |
-
[
|
| 71 |
docker compose up
|
| 72 |
```
|
| 73 |
|
| 74 |
-
Open **http://localhost:7860**
|
| 75 |
|
| 76 |
-
|
| 77 |
-
<summary>Alternative: UV (10× faster than pip)</summary>
|
| 78 |
|
| 79 |
```bash
|
| 80 |
uv venv && source .venv/bin/activate
|
|
@@ -82,32 +103,9 @@ uv pip install -r requirements.txt
|
|
| 82 |
python app/main.py
|
| 83 |
```
|
| 84 |
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
- [Groq API key](https://console.groq.com/keys) (Required - GPT-OSS & Llama models)
|
| 89 |
-
- [OpenRouter API key](https://openrouter.ai/keys) (Optional - Gemma model)
|
| 90 |
-
|
| 91 |
-
---
|
| 92 |
-
|
| 93 |
-
## Production Features Checklist
|
| 94 |
-
|
| 95 |
-
> 10 criteria for enterprise-grade RAG. Each is satisfied by this platform.
|
| 96 |
-
|
| 97 |
-
| Feature | Description |
|
| 98 |
-
|----------|----------|
|
| 99 |
-
| **Multi-format ingestion** | PDF, DOCX, TXT with intelligent parsing |
|
| 100 |
-
| **Semantic chunking** | 1000-char chunks, 200-char overlap |
|
| 101 |
-
| **Production embeddings** | bge-small-en-v1.5 (MTEB optimized) |
|
| 102 |
-
| **Persistent storage** | ChromaDB survives restarts |
|
| 103 |
-
| **Citation tracking** | Every answer links to source chunks |
|
| 104 |
-
| **Rate limiting** | 10 queries/hour (configurable) |
|
| 105 |
-
| **Privacy controls** | Auto-delete after 7 days |
|
| 106 |
-
| **Monitoring hooks** | Health checks, error logging |
|
| 107 |
-
| **Fast** | 50-200ms response time (p50) |
|
| 108 |
-
| **Portable** | Docker-ready, one-command deploy |
|
| 109 |
-
|
| 110 |
-
**[Design Decisions →](docs/DESIGN_DECISIONS.md)** — Deep dive into architectural choices.
|
| 111 |
|
| 112 |
---
|
| 113 |
|
|
@@ -115,30 +113,27 @@ python app/main.py
|
|
| 115 |
|
| 116 |
| Metric | Value |
|
| 117 |
|--------|-------|
|
| 118 |
-
| **
|
| 119 |
-
| **
|
| 120 |
-
| **100-page contract** | 3-4s process, 150ms query |
|
| 121 |
| **Citation accuracy** | 93-96% relevance |
|
| 122 |
-
| **
|
| 123 |
-
|
| 124 |
-
*Powered by Groq's lightning-fast inference and optimized retrieval*
|
| 125 |
|
| 126 |
---
|
| 127 |
|
| 128 |
-
##
|
| 129 |
|
| 130 |
-
**2-week paid pilots** for
|
| 131 |
|
| 132 |
| Week | Deliverables |
|
| 133 |
|------|--------------|
|
| 134 |
-
| **Week 1** |
|
| 135 |
-
| **Week 2** |
|
| 136 |
|
| 137 |
-
**Includes**
|
| 138 |
|
| 139 |
<p align="center">
|
| 140 |
-
<a href="https://cal.com/
|
| 141 |
-
<img src="https://img.shields.io/badge/📅_Book_Discovery_Call-
|
| 142 |
</a>
|
| 143 |
</p>
|
| 144 |
|
|
@@ -148,14 +143,12 @@ python app/main.py
|
|
| 148 |
|
| 149 |
**Prateek Kumar Goel**
|
| 150 |
|
| 151 |
-
[](https://github.com/pkgprateek)
|
| 153 |
[](https://huggingface.co/pkgprateek)
|
| 154 |
|
| 155 |
---
|
| 156 |
|
| 157 |
<p align="center">
|
| 158 |
-
<sub>
|
| 159 |
-
MIT License · Built with production-grade MLOps practices
|
| 160 |
-
</sub>
|
| 161 |
</p>
|
|
|
|
| 1 |
+
# Enterprise RAG Platform
|
| 2 |
|
| 3 |
+
<div align="center">
|
| 4 |
+
|
| 5 |
+
**Turn documents into answers. Instantly.**
|
| 6 |
+
|
| 7 |
+
Upload contracts, research papers, or financial reports. Ask questions in plain English. Get precise, cited answers in seconds.
|
| 8 |
|
| 9 |
[](https://pkgprateek-ai-rag-document.hf.space/)
|
| 10 |
[](https://github.com/pkgprateek/ai-rag-document/actions/workflows/deploy-to-hf.yml)
|
| 11 |
[](https://www.python.org/downloads/)
|
| 12 |
+
[](LICENSE)
|
| 13 |
|
| 14 |
+
<a href="https://pkgprateek-ai-rag-document.hf.space/">
|
| 15 |
+
<img src="assets/demo-screenshot.jpeg" alt="Enterprise RAG Demo" width="700"/>
|
| 16 |
+
</a>
|
| 17 |
+
|
| 18 |
+
</div>
|
| 19 |
+
|
| 20 |
+
---
|
| 21 |
+
|
| 22 |
+
## The Problem
|
| 23 |
+
|
| 24 |
+
Knowledge workers spend **2.5 hours daily** searching for information buried in documents. Legal teams review contracts manually. Researchers dig through papers. Finance teams hunt for clauses in agreements.
|
| 25 |
+
|
| 26 |
+
## The Solution
|
| 27 |
+
|
| 28 |
+
**Enterprise RAG** eliminates that friction:
|
| 29 |
+
|
| 30 |
+
```
|
| 31 |
+
Upload documents → Ask questions → Get cited answers in <5 seconds
|
| 32 |
+
```
|
| 33 |
+
|
| 34 |
+
No more Ctrl+F. No more reading 50 pages to find one clause. Just ask.
|
| 35 |
|
| 36 |
---
|
| 37 |
|
| 38 |
+
## Features
|
| 39 |
|
| 40 |
+
| Feature | What You Get |
|
| 41 |
+
|---------|--------------|
|
| 42 |
+
| **Multi-document upload** | Process multiple files at once with batch progress |
|
| 43 |
+
| **Streaming answers** | Watch answers generate in real time, with a thinking indicator |
|
| 44 |
+
| **Inline citations** | Every claim linked to source document + page number |
|
| 45 |
+
| **3 AI models** | GPT-OSS 120B, Llama 3.3 70B, Gemma 3 27B |
|
| 46 |
+
| **Session isolation** | Your documents are private to your session |
|
| 47 |
+
| **Auto-cleanup** | Documents auto-deleted after 7 days |
|
| 48 |
|
| 49 |
---
|
| 50 |
|
| 51 |
## Architecture
|
| 52 |
|
| 53 |
```mermaid
|
| 54 |
+
flowchart LR
|
| 55 |
+
subgraph Input
|
| 56 |
+
A[📄 PDF / DOCX / TXT]
|
|
|
|
|
|
|
| 57 |
end
|
| 58 |
|
| 59 |
+
subgraph Processing
|
| 60 |
+
B[✂️ Chunk<br/>1000 chars]
|
| 61 |
+
C[🧠 Embed<br/>bge-small-en-v1.5]
|
| 62 |
+
D[(💾 ChromaDB)]
|
| 63 |
end
|
| 64 |
|
| 65 |
+
subgraph Query
|
| 66 |
+
E[💬 Question]
|
| 67 |
+
F[🎯 Top-4 Retrieval]
|
| 68 |
+
G[🤖 LLM Stream]
|
| 69 |
+
H[📝 Cited Answer]
|
| 70 |
end
|
| 71 |
|
| 72 |
+
A --> B --> C --> D
|
| 73 |
+
E --> F --> G --> H
|
| 74 |
+
D --> F
|
|
|
|
|
|
|
| 75 |
```
|
| 76 |
|
| 77 |
+
**Stack:** LangChain · ChromaDB · sentence-transformers · Groq + OpenRouter
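
The chunking stage in the diagram can be illustrated with a plain-Python sliding window. This is only a sketch: the 1000-character size comes from the diagram, the 200-character overlap is taken from the earlier README revision, and `chunk_text` is a hypothetical helper rather than the splitter the project actually uses (the stack lists LangChain, whose splitters behave differently at separators).

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical stand-in for the project's real splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means the tail of one chunk is repeated at the head of the next, so a sentence that straddles a boundary is still retrievable in one piece.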
|
| 78 |
|
| 79 |
---
|
| 80 |
|
| 81 |
+
## Quick Start
|
| 82 |
+
|
| 83 |
+
### Docker (Recommended)
|
| 84 |
|
| 85 |
```bash
|
|
|
|
| 86 |
git clone https://github.com/pkgprateek/rag-document-qa-workflow.git
|
| 87 |
cd rag-document-qa-workflow
|
| 88 |
|
| 89 |
+
# Add your API keys
|
| 90 |
+
echo "GROQ_API_KEY=your_key" > .env
|
| 91 |
+
echo "OPENROUTER_API_KEY=your_key" >> .env
|
| 92 |
|
|
|
|
| 93 |
docker compose up
|
| 94 |
```
|
| 95 |
|
| 96 |
+
Open **http://localhost:7860**
|
| 97 |
|
| 98 |
+
### Local Development
|
|
|
|
| 99 |
|
| 100 |
```bash
|
| 101 |
uv venv && source .venv/bin/activate
|
|
|
|
| 103 |
python app/main.py
|
| 104 |
```
|
| 105 |
|
| 106 |
+
**Get Free API Keys:**
|
| 107 |
+
- [Groq](https://console.groq.com/keys) — Required (GPT-OSS, Llama)
|
| 108 |
+
- [OpenRouter](https://openrouter.ai/keys) — Optional (Gemma)
|
| 109 |
|
| 110 |
---
|
| 111 |
|
|
|
|
| 113 |
|
| 114 |
| Metric | Value |
|
| 115 |
|--------|-------|
|
| 116 |
+
| **Query latency** | 50-200ms (p95) |
|
| 117 |
+
| **Document processing** | 3-4s for 100 pages |
|
|
|
|
| 118 |
| **Citation accuracy** | 93-96% relevance |
|
| 119 |
+
| **Streaming** | First token in <500ms |
|
|
|
|
|
|
|
| 120 |
|
| 121 |
---
|
| 122 |
|
| 123 |
+
## Enterprise Pilots
|
| 124 |
|
| 125 |
+
**2-week paid pilots** for teams ready to deploy RAG on their infrastructure:
|
| 126 |
|
| 127 |
| Week | Deliverables |
|
| 128 |
|------|--------------|
|
| 129 |
+
| **Week 1** | Document ingestion, chunking tuned for your domain |
|
| 130 |
+
| **Week 2** | Deployment, team training, ROI analysis |
|
| 131 |
|
| 132 |
+
**Includes:** Custom RAG system · Performance benchmarks · 30-day support
|
| 133 |
|
| 134 |
<p align="center">
|
| 135 |
+
<a href="https://cal.com/prateekgoel/30m-discovery-call">
|
| 136 |
+
<img src="https://img.shields.io/badge/📅_Book_Discovery_Call-00C853?style=for-the-badge" alt="Book Call"/>
|
| 137 |
</a>
|
| 138 |
</p>
|
| 139 |
|
|
|
|
| 143 |
|
| 144 |
**Prateek Kumar Goel**
|
| 145 |
|
| 146 |
+
[](https://huggingface.co/spaces/pkgprateek/ai-rag-document)
|
| 147 |
[](https://github.com/pkgprateek)
|
| 148 |
[](https://huggingface.co/pkgprateek)
|
| 149 |
|
| 150 |
---
|
| 151 |
|
| 152 |
<p align="center">
|
| 153 |
+
<sub>MIT License · Built with ❤️ for enterprise document intelligence</sub>
|
|
|
|
|
|
|
| 154 |
</p>
|
app/main.py
CHANGED
|
@@ -104,47 +104,65 @@ class DocumentRagApp:
|
|
| 104 |
except Exception as e:
|
| 105 |
yield f"❌ Error: {str(e)}", loaded_docs
|
| 106 |
|
| 107 |
-
def process_file(self,
|
| 108 |
-
"""Process uploaded file with live progress updates"""
|
| 109 |
loaded_docs = list(current_docs) if current_docs else []
|
| 110 |
|
| 111 |
-
if not
|
| 112 |
yield "⚠️ Please upload a file", loaded_docs
|
| 113 |
return
|
| 114 |
|
| 115 |
try:
|
| 116 |
-
|
| 117 |
-
|
| 118 |
-
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
| 128 |
-
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
|
|
|
| 132 |
|
| 133 |
-
|
| 134 |
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
| 138 |
-
|
| 139 |
-
)
|
| 140 |
|
| 141 |
-
|
| 142 |
-
|
|
|
|
|
|
|
| 143 |
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
-
|
| 148 |
except Exception as e:
|
| 149 |
yield (
|
| 150 |
f"❌ Error: {str(e)}. Please try again or contact support.",
|
|
@@ -181,6 +199,24 @@ class DocumentRagApp:
|
|
| 181 |
except Exception as e:
|
| 182 |
return f"Error: {str(e)}"
|
| 183 |
|
| 184 |
def delete_document(self, doc_to_delete, session_id, current_docs):
|
| 185 |
"""
|
| 186 |
Delete a document from the session.
|
|
@@ -696,8 +732,9 @@ with gr.Blocks(css=css, theme=gr.themes.Base(), title="Enterprise RAG") as demo:
|
|
| 696 |
gr.Markdown("### OR UPLOAD DOCUMENTS", elem_classes="card-header")
|
| 697 |
file_upload = gr.File(
|
| 698 |
file_types=[".pdf", ".docx", ".txt"],
|
|
|
|
| 699 |
show_label=True,
|
| 700 |
-
height=240,
|
| 701 |
)
|
| 702 |
|
| 703 |
# Security Badge
|
|
@@ -755,7 +792,7 @@ with gr.Blocks(css=css, theme=gr.themes.Base(), title="Enterprise RAG") as demo:
|
|
| 755 |
elem_classes="doc-checkbox-group",
|
| 756 |
)
|
| 757 |
# Spacing before delete button
|
| 758 |
-
gr.HTML('<div style="height: 0.
|
| 759 |
with gr.Row():
|
| 760 |
remove_docs_btn = gr.Button(
|
| 761 |
"🗑️ Delete Selected Documents",
|
|
@@ -888,16 +925,35 @@ with gr.Blocks(css=css, theme=gr.themes.Base(), title="Enterprise RAG") as demo:
|
|
| 888 |
)
|
| 889 |
|
| 890 |
# File upload
|
| 891 |
-
def process_file_wrapper(
|
| 892 |
session_id = get_session_id(session_data)
|
| 893 |
-
|
|
|
|
|
|
|
| 894 |
checkbox_update, btn_update = update_doc_ui(docs)
|
| 895 |
-
|
| 896 |
|
| 897 |
process_btn.click(
|
| 898 |
fn=process_file_wrapper,
|
| 899 |
inputs=[file_upload, session_state, docs_state],
|
| 900 |
-
outputs=[
|
| 901 |
)
|
| 902 |
|
| 903 |
# Document deletion (batch removal via checkboxes)
|
|
@@ -933,50 +989,61 @@ with gr.Blocks(css=css, theme=gr.themes.Base(), title="Enterprise RAG") as demo:
|
|
| 933 |
fn=app.switch_model, inputs=model_selector, outputs=model_status
|
| 934 |
)
|
| 935 |
|
| 936 |
-
# Question answering -
|
| 937 |
-
def
|
| 938 |
session_id = get_session_id(session_data)
|
| 939 |
-
|
|
|
|
|
|
|
|
|
|
| 940 |
|
| 941 |
-
def
|
| 942 |
session_id = get_session_id(session_data)
|
| 943 |
-
|
|
|
|
| 944 |
|
| 945 |
-
def
|
| 946 |
session_id = get_session_id(session_data)
|
| 947 |
-
|
|
|
|
| 948 |
|
| 949 |
-
def
|
| 950 |
session_id = get_session_id(session_data)
|
| 951 |
-
|
|
|
|
|
|
|
|
|
|
| 952 |
|
| 953 |
-
def
|
| 954 |
session_id = get_session_id(session_data)
|
| 955 |
-
|
|
|
|
| 956 |
|
| 957 |
q1.click(
|
| 958 |
-
fn=
|
| 959 |
inputs=[session_state, docs_state],
|
| 960 |
outputs=answer,
|
| 961 |
)
|
| 962 |
q2.click(
|
| 963 |
-
fn=
|
| 964 |
inputs=[session_state, docs_state],
|
| 965 |
outputs=answer,
|
| 966 |
)
|
| 967 |
q3.click(
|
| 968 |
-
fn=
|
| 969 |
inputs=[session_state, docs_state],
|
| 970 |
outputs=answer,
|
| 971 |
)
|
| 972 |
q4.click(
|
| 973 |
-
fn=
|
| 974 |
inputs=[session_state, docs_state],
|
| 975 |
outputs=answer,
|
| 976 |
)
|
| 977 |
|
| 978 |
ask_btn.click(
|
| 979 |
-
fn=
|
|
|
|
|
|
|
| 980 |
)
|
| 981 |
|
| 982 |
if __name__ == "__main__":
|
|
|
|
| 104 |
except Exception as e:
|
| 105 |
yield f"❌ Error: {str(e)}", loaded_docs
|
| 106 |
|
| 107 |
+
def process_file(self, files, session_id, current_docs):
|
| 108 |
+
"""Process uploaded file(s) with live progress updates. Supports single or multiple files."""
|
| 109 |
loaded_docs = list(current_docs) if current_docs else []
|
| 110 |
|
| 111 |
+
if not files:
|
| 112 |
yield "⚠️ Please upload a file", loaded_docs
|
| 113 |
return
|
| 114 |
|
| 115 |
+
# Normalize to list (handles both single file and list of files)
|
| 116 |
+
file_list = files if isinstance(files, list) else [files]
|
| 117 |
+
total_files = len(file_list)
|
| 118 |
+
total_chunks = 0
|
| 119 |
+
processed_files = []
|
| 120 |
+
|
| 121 |
try:
|
| 122 |
+
for idx, file in enumerate(file_list, 1):
|
| 123 |
+
filename = os.path.basename(file.name)
|
| 124 |
+
yield f"📄 Processing {idx}/{total_files}: {filename}...", loaded_docs
|
| 125 |
+
|
| 126 |
+
ext = os.path.splitext(file.name)[1].lower()
|
| 127 |
+
if ext == ".pdf":
|
| 128 |
+
chunks = self.processor.process_pdf(file.name)
|
| 129 |
+
elif ext == ".txt":
|
| 130 |
+
chunks = self.processor.process_txt(file.name)
|
| 131 |
+
elif ext == ".docx":
|
| 132 |
+
chunks = self.processor.process_docx(file.name)
|
| 133 |
+
else:
|
| 134 |
+
yield (
|
| 135 |
+
f"⚠️ Skipped {filename}: Unsupported format (use PDF, DOCX, or TXT)",
|
| 136 |
+
loaded_docs,
|
| 137 |
+
)
|
| 138 |
+
continue
|
| 139 |
|
| 140 |
+
yield f"✂️ {filename}: Created {len(chunks)} chunks...", loaded_docs
|
| 141 |
|
| 142 |
+
# Pass session_id for user document isolation
|
| 143 |
+
self.rag_pipeline.add_documents(
|
| 144 |
+
chunks, session_id=session_id, is_sample=False
|
| 145 |
+
)
|
|
|
|
| 146 |
|
| 147 |
+
if filename not in loaded_docs:
|
| 148 |
+
loaded_docs.append(filename)
|
| 149 |
+
total_chunks += len(chunks)
|
| 150 |
+
processed_files.append(filename)
|
| 151 |
|
| 152 |
+
# Final success message
|
| 153 |
+
if processed_files:
|
| 154 |
+
if len(processed_files) == 1:
|
| 155 |
+
yield (
|
| 156 |
+
f"✓ Success! {processed_files[0]} ready ({total_chunks} searchable chunks)",
|
| 157 |
+
loaded_docs,
|
| 158 |
+
)
|
| 159 |
+
else:
|
| 160 |
+
yield (
|
| 161 |
+
f"✓ Success! {len(processed_files)} documents processed ({total_chunks} total chunks)",
|
| 162 |
+
loaded_docs,
|
| 163 |
+
)
|
| 164 |
+
else:
|
| 165 |
+
yield "⚠️ No valid documents to process", loaded_docs
|
| 166 |
except Exception as e:
|
| 167 |
yield (
|
| 168 |
f"❌ Error: {str(e)}. Please try again or contact support.",
|
|
|
|
| 199 |
except Exception as e:
|
| 200 |
return f"Error: {str(e)}"
|
| 201 |
|
| 202 |
+
def ask_stream(self, question, session_id, current_docs):
|
| 203 |
+
"""Stream answer with thinking indicator for real-time display."""
|
| 204 |
+
if not current_docs:
|
| 205 |
+
yield "Please load documents first"
|
| 206 |
+
return
|
| 207 |
+
if not question or not question.strip():
|
| 208 |
+
yield "Please enter a question"
|
| 209 |
+
return
|
| 210 |
+
|
| 211 |
+
# Thinking indicator
|
| 212 |
+
yield "🔍 Analyzing documents..."
|
| 213 |
+
|
| 214 |
+
try:
|
| 215 |
+
for answer_text in self.rag_pipeline.query_stream(question, session_id):
|
| 216 |
+
yield answer_text
|
| 217 |
+
except Exception as e:
|
| 218 |
+
yield f"Error: {str(e)}"
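
The shape of `ask_stream` can be reproduced without the pipeline: guard clauses yield a single message and return, a placeholder is yielded before any model output, and every later yield is a full snapshot that replaces what the UI shows. The sketch below assumes the upstream stream already yields accumulated text, as `query_stream` documents; `ask_stream_sketch` and its `answer_stream` parameter are hypothetical names.

```python
from typing import Iterable, Iterator, List


def ask_stream_sketch(
    question: str,
    current_docs: List[str],
    answer_stream: Iterable[str],
) -> Iterator[str]:
    """Yield UI states: a guard message, or a placeholder followed by snapshots."""
    if not current_docs:
        yield "Please load documents first"
        return
    if not question or not question.strip():
        yield "Please enter a question"
        return

    # Shown until the first snapshot arrives and replaces it.
    yield "🔍 Analyzing documents..."

    try:
        for snapshot in answer_stream:
            yield snapshot
    except Exception as exc:
        # Note: this replaces any partial answer already on screen.
        yield f"Error: {exc}"
```

One consequence of this design is that an error raised mid-stream overwrites the partial answer, since each yield replaces the previous output rather than appending to it.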
|
| 219 |
+
|
| 220 |
def delete_document(self, doc_to_delete, session_id, current_docs):
|
| 221 |
"""
|
| 222 |
Delete a document from the session.
|
|
|
|
| 732 |
gr.Markdown("### OR UPLOAD DOCUMENTS", elem_classes="card-header")
|
| 733 |
file_upload = gr.File(
|
| 734 |
file_types=[".pdf", ".docx", ".txt"],
|
| 735 |
+
file_count="multiple", # Enable multi-file selection
|
| 736 |
show_label=True,
|
| 737 |
+
height=240,
|
| 738 |
)
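
With `file_count="multiple"` the handler may receive a list where it previously received one file, which is why `process_file` normalizes its input before dispatching on the extension. A minimal sketch of that routing step, assuming uploads arrive as plain path strings (the diff itself reads `file.name`); `route_uploads` and `SUPPORTED` are hypothetical names, and the method names mirror the processor calls in the diff:

```python
import os
from typing import List, Optional, Tuple, Union

# Extension -> name of the processor method that handles it.
SUPPORTED = {
    ".pdf": "process_pdf",
    ".txt": "process_txt",
    ".docx": "process_docx",
}


def route_uploads(
    files: Union[str, List[str], None],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (routed, skipped): routed pairs a filename with a method name."""
    if not files:
        return [], []

    # Accept a single path or a list of paths.
    paths = files if isinstance(files, list) else [files]

    routed: List[Tuple[str, str]] = []
    skipped: List[str] = []
    for path in paths:
        name = os.path.basename(path)
        ext = os.path.splitext(path)[1].lower()  # case-insensitive match
        method: Optional[str] = SUPPORTED.get(ext)
        if method is None:
            skipped.append(name)
        else:
            routed.append((name, method))
    return routed, skipped
```

Keeping the mapping in one dictionary means a new format needs one entry rather than another `elif` branch.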
|
| 739 |
|
| 740 |
# Security Badge
|
|
|
|
| 792 |
elem_classes="doc-checkbox-group",
|
| 793 |
)
|
| 794 |
# Spacing before delete button
|
| 795 |
+
gr.HTML('<div style="height: 0.01rem;"></div>')
|
| 796 |
with gr.Row():
|
| 797 |
remove_docs_btn = gr.Button(
|
| 798 |
"🗑️ Delete Selected Documents",
|
|
|
|
| 925 |
)
|
| 926 |
|
| 927 |
# File upload
|
| 928 |
+
def process_file_wrapper(files, session_data, current_docs):
|
| 929 |
session_id = get_session_id(session_data)
|
| 930 |
+
# Process files and yield progress
|
| 931 |
+
final_docs = current_docs
|
| 932 |
+
for status, docs in app.process_file(files, session_id, current_docs):
|
| 933 |
checkbox_update, btn_update = update_doc_ui(docs)
|
| 934 |
+
final_docs = docs
|
| 935 |
+
# During processing, keep file visible
|
| 936 |
+
yield status, docs, checkbox_update, btn_update, gr.update()
|
| 937 |
+
# After processing, clear the file upload for new uploads
|
| 938 |
+
checkbox_update, btn_update = update_doc_ui(final_docs)
|
| 939 |
+
yield (
|
| 940 |
+
gr.update(),  # leave the final status message visible instead of blanking it
|
| 941 |
+
final_docs,
|
| 942 |
+
checkbox_update,
|
| 943 |
+
btn_update,
|
| 944 |
+
gr.update(value=None),
|
| 945 |
+
)
|
| 946 |
|
| 947 |
process_btn.click(
|
| 948 |
fn=process_file_wrapper,
|
| 949 |
inputs=[file_upload, session_state, docs_state],
|
| 950 |
+
outputs=[
|
| 951 |
+
upload_status,
|
| 952 |
+
docs_state,
|
| 953 |
+
doc_checkboxes,
|
| 954 |
+
remove_docs_btn,
|
| 955 |
+
file_upload,
|
| 956 |
+
],
|
| 957 |
)
|
| 958 |
|
| 959 |
# Document deletion (batch removal via checkboxes)
|
|
|
|
| 989 |
fn=app.switch_model, inputs=model_selector, outputs=model_status
|
| 990 |
)
|
| 991 |
|
| 992 |
+
# Question answering - streaming handlers for all questions
|
| 993 |
+
def ask_termination_stream(session_data, current_docs):
|
| 994 |
session_id = get_session_id(session_data)
|
| 995 |
+
for text in app.ask_stream(
|
| 996 |
+
"What are the termination conditions?", session_id, current_docs
|
| 997 |
+
):
|
| 998 |
+
yield text
|
| 999 |
|
| 1000 |
+
def ask_payment_stream(session_data, current_docs):
|
| 1001 |
session_id = get_session_id(session_data)
|
| 1002 |
+
for text in app.ask_stream("Summarize payment terms", session_id, current_docs):
|
| 1003 |
+
yield text
|
| 1004 |
|
| 1005 |
+
def ask_findings_stream(session_data, current_docs):
|
| 1006 |
session_id = get_session_id(session_data)
|
| 1007 |
+
for text in app.ask_stream("Summarize key findings", session_id, current_docs):
|
| 1008 |
+
yield text
|
| 1009 |
|
| 1010 |
+
def ask_risks_stream(session_data, current_docs):
|
| 1011 |
session_id = get_session_id(session_data)
|
| 1012 |
+
for text in app.ask_stream(
|
| 1013 |
+
"What are the key risks mentioned?", session_id, current_docs
|
| 1014 |
+
):
|
| 1015 |
+
yield text
|
| 1016 |
|
| 1017 |
+
def ask_custom_stream(question, session_data, current_docs):
|
| 1018 |
session_id = get_session_id(session_data)
|
| 1019 |
+
for text in app.ask_stream(question, session_id, current_docs):
|
| 1020 |
+
yield text
|
| 1021 |
|
| 1022 |
q1.click(
|
| 1023 |
+
fn=ask_termination_stream,
|
| 1024 |
inputs=[session_state, docs_state],
|
| 1025 |
outputs=answer,
|
| 1026 |
)
|
| 1027 |
q2.click(
|
| 1028 |
+
fn=ask_payment_stream,
|
| 1029 |
inputs=[session_state, docs_state],
|
| 1030 |
outputs=answer,
|
| 1031 |
)
|
| 1032 |
q3.click(
|
| 1033 |
+
fn=ask_findings_stream,
|
| 1034 |
inputs=[session_state, docs_state],
|
| 1035 |
outputs=answer,
|
| 1036 |
)
|
| 1037 |
q4.click(
|
| 1038 |
+
fn=ask_risks_stream,
|
| 1039 |
inputs=[session_state, docs_state],
|
| 1040 |
outputs=answer,
|
| 1041 |
)
|
| 1042 |
|
| 1043 |
ask_btn.click(
|
| 1044 |
+
fn=ask_custom_stream,
|
| 1045 |
+
inputs=[question, session_state, docs_state],
|
| 1046 |
+
outputs=answer,
|
| 1047 |
)
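
The four preset handlers above differ only in the question string, so they could be generated by a small factory. This is a sketch of one possible refactor, not what the commit does; `make_preset_handler` is a hypothetical helper, and `ask_stream` and `get_session_id` are passed in so the example runs without the app.

```python
from typing import Callable, Iterator


def make_preset_handler(
    ask_stream: Callable[[str, str, list], Iterator[str]],
    get_session_id: Callable[[object], str],
    question: str,
) -> Callable[[object, list], Iterator[str]]:
    """Build a streaming click handler bound to one fixed question."""

    def handler(session_data, current_docs):
        session_id = get_session_id(session_data)
        # `yield from` keeps the handler a generator function, which is
        # what a streaming UI callback needs to be.
        yield from ask_stream(question, session_id, current_docs)

    return handler
```

Under that assumption, a preset button would be wired with something like `fn=make_preset_handler(app.ask_stream, get_session_id, "Summarize payment terms")`, keeping the same `inputs` and `outputs` as in the diff.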
|
| 1048 |
|
| 1049 |
if __name__ == "__main__":
|
app/rag_pipeline.py
CHANGED
|
@@ -388,6 +388,106 @@ Answer:""",
|
|
| 388 |
|
| 389 |
return {"answer": answer_text}
|
| 390 |
|
| 391 |
def _extract_citations(self, source_documents: List[Document]) -> List[dict]:
|
| 392 |
"""
|
| 393 |
Extract formatted citations from source documents with page numbers and previews.
|
|
|
|
| 388 |
|
| 389 |
return {"answer": answer_text}
|
| 390 |
|
| 391 |
+
def query_stream(self, question: str, session_id: str = None):
|
| 392 |
+
"""
|
| 393 |
+
Stream answer tokens for real-time display.
|
| 394 |
+
Yields the accumulated answer each time a new token arrives from the LLM.
|
| 395 |
+
|
| 396 |
+
Args:
|
| 397 |
+
question: User's question string
|
| 398 |
+
session_id: User's session ID for filtering results
|
| 399 |
+
|
| 400 |
+
Yields:
|
| 401 |
+
str: Accumulated answer text (each yield contains full answer so far)
|
| 402 |
+
"""
|
| 403 |
+
# Check rate limit
|
| 404 |
+
if not self._check_rate_limit():
|
| 405 |
+
yield "⚠️ Rate limit exceeded. You can only ask 10 questions per hour. Please try again later."
|
| 406 |
+
return
|
| 407 |
+
|
| 408 |
+
# Set session ID for filtered retrieval
|
| 409 |
+
self._current_session_id = session_id
|
| 410 |
+
|
| 411 |
+
# Get documents using retriever (non-streaming part)
|
| 412 |
+
retriever = self.vector_store.as_retriever(search_kwargs={"k": 4})
|
| 413 |
+
docs = retriever.invoke(question)
|
| 414 |
+
|
| 415 |
+
# Filter by session
|
| 416 |
+
if session_id:
|
| 417 |
+
docs = [
|
| 418 |
+
d
|
| 419 |
+
for d in docs
|
| 420 |
+
if d.metadata.get("session_id") == session_id
|
| 421 |
+
or d.metadata.get("is_sample", False)
|
| 422 |
+
]
|
| 423 |
+
|
| 424 |
+
if not docs:
|
| 425 |
+
yield "I couldn't find relevant information in your documents. Please try rephrasing your question."
|
| 426 |
+
return
|
| 427 |
+
|
| 428 |
+
# Build context and sources
|
| 429 |
+
context = "\n\n".join([d.page_content for d in docs])
|
| 430 |
+
sources = ", ".join(
|
| 431 |
+
sorted({d.metadata.get("source", "").split("/")[-1] for d in docs})  # sorted for a stable order
|
| 432 |
+
)
|
| 433 |
+
|
| 434 |
+
# Format prompt
|
| 435 |
+
prompt = self._format_prompt(context, sources, question)
|
| 436 |
+
|
| 437 |
+
# Stream from LLM
|
| 438 |
+
full_answer = ""
|
| 439 |
+
for chunk in self.llm.stream(prompt):
|
| 440 |
+
if hasattr(chunk, "content"):
|
| 441 |
+
full_answer += chunk.content
|
| 442 |
+
else:
|
| 443 |
+
full_answer += str(chunk)
|
| 444 |
+
yield full_answer
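
One property of `query_stream` worth noting: the session filter runs after the top-4 retrieval, so when other sessions' chunks rank highly the model may see fewer than four chunks, or none. A common workaround is to fetch more candidates than needed, filter, then trim. The sketch below shows that idea on plain dictionaries; `filter_hits`, `visible_to_session`, and the hit layout are hypothetical. If the vector store accepts a metadata filter at query time, filtering there would avoid the issue entirely.

```python
from typing import Any, Dict, List, Optional


def visible_to_session(metadata: Dict[str, Any], session_id: str) -> bool:
    """A chunk is visible if it belongs to the session or is a shared sample."""
    return metadata.get("session_id") == session_id or bool(
        metadata.get("is_sample", False)
    )


def filter_hits(
    hits: List[Dict[str, Any]],
    session_id: Optional[str],
    k: int = 4,
) -> List[Dict[str, Any]]:
    """Filter ranked hits by session, then keep the best k that remain."""
    if session_id:
        hits = [h for h in hits if visible_to_session(h["metadata"], session_id)]
    return hits[:k]
```

Passing in a candidate list several times larger than `k` makes it much more likely that `k` visible chunks survive the filter.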
|
| 445 |
+
|
| 446 |
+
def _format_prompt(self, context: str, sources: str, question: str) -> str:
|
| 447 |
+
"""
|
| 448 |
+
Format the RAG prompt with context, sources, and question.
|
| 449 |
+
|
| 450 |
+
Args:
|
| 451 |
+
context: Retrieved document content
|
| 452 |
+
sources: Comma-separated source filenames
|
| 453 |
+
question: User's question
|
| 454 |
+
|
| 455 |
+
Returns:
|
| 456 |
+
str: Formatted prompt string
|
| 457 |
+
"""
|
| 458 |
+
return f"""You are an expert AI assistant specializing in document analysis. Your goal is to provide comprehensive, accurate, and well-cited answers.
|
| 459 |
+
|
| 460 |
+
Available Documents: {sources}
|
| 461 |
+
|
| 462 |
+
Context from Documents:
|
| 463 |
+
{context}
|
| 464 |
+
|
| 465 |
+
User Question: {question}
|
| 466 |
+
|
| 467 |
+
INSTRUCTIONS FOR YOUR RESPONSE:
|
| 468 |
+
1. **Analyze Thoroughly**: Read the context carefully and identify all relevant information
|
| 469 |
+
2. **Answer Comprehensively**: Provide a complete, detailed answer that fully addresses the question
|
| 470 |
+
3. **Use Proper Structure**:
|
| 471 |
+
- Start with a clear, direct answer
|
| 472 |
+
- Follow with supporting details and explanation
|
| 473 |
+
- Use markdown formatting (headings, bullet points, bold) for readability
|
| 474 |
+
4. **Cite Sources Inline**: As you make specific claims, cite the source immediately
|
| 475 |
+
- Format: (Source: filename, Page X) or (Source: filename) if page unknown
|
| 476 |
+
- Example: "The termination period is 30 days (Source: service_agreement.pdf, Page 3)"
|
| 477 |
+
- Be specific about which document and page number whenever possible
|
| 478 |
+
5. **Include a Sources Section**: At the end of your answer, add:
|
| 479 |
+
**Sources Referenced:**
|
| 480 |
+
• filename (Page X) - Brief note about what info came from here
|
| 481 |
+
• filename2 (Page Y) - Brief note
|
| 482 |
+
|
| 483 |
+
6. **Quality Standards**:
|
| 484 |
+
- Be specific and precise with facts, numbers, dates, and terms
|
| 485 |
+
- Quote exact phrases when important (use quotation marks)
|
| 486 |
+
- If information is unclear or missing, state what's uncertain
|
| 487 |
+
- Connect related points to create a cohesive narrative
|
| 488 |
+
|
| 489 |
+
Answer:"""
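
Back in `query_stream`, the first step calls `self._check_rate_limit()`, which this diff does not show. The user-facing message mentions 10 questions per hour, so one plausible shape is a sliding-window limiter. The class below is a hypothetical sketch under that assumption, not the project's implementation; the injectable `clock` exists only to make it testable.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `window_seconds` span."""

    def __init__(
        self,
        max_calls: int = 10,
        window_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock
        self._calls: Deque[float] = deque()  # timestamps of accepted calls

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()
        if len(self._calls) >= self._max_calls:
            return False
        self._calls.append(now)
        return True
```

A single shared limiter would throttle all users together; keeping one instance per session ID would match the session isolation described elsewhere in this change.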
|
| 490 |
+
|
| 491 |
def _extract_citations(self, source_documents: List[Document]) -> List[dict]:
|
| 492 |
"""
|
| 493 |
Extract formatted citations from source documents with page numbers and previews.
|