Shouvik599 committed
Commit · 26a5301
Parent(s): 56da115
Updated README
README.md
CHANGED
|
@@ -1,190 +1,3 @@
|
|
| 1 |
-
|
| 2 |
-
Can you connect to my github account i have added as a connector
|
| 3 |
-
|
| 4 |
-
12:50 AM
|
| 5 |
-
Connector search
|
| 6 |
-
1 connector
|
| 7 |
-
Microsoft Learn
|
| 8 |
-
It looks like GitHub isn't showing up as a connected or available connector in the registry right now. A few things to check:
|
| 9 |
-
|
| 10 |
-
Make sure it's enabled in the Tools menu → Click the tools/connector icon in the chat interface and confirm GitHub is toggled on for this conversation.
|
| 11 |
-
Re-authenticate → Sometimes connectors need to be reconnected. Go to your connector settings and try reconnecting your GitHub account.
|
| 12 |
-
Check the connector was added to the right workspace → If you added it under a different account or session, it may not carry over.
|
| 13 |
-
Once it's connected and enabled, I'll be able to interact with your GitHub repos, issues, PRs, and more. Let me know if you run into any trouble setting it up!
|
| 14 |
-
|
| 15 |
-
Shouvik599/LifeGuide
|
| 16 |
-
main
|
| 17 |
-
|
| 18 |
-
GITHUB
|
| 19 |
-
|
| 20 |
-
from the mentioned list of features to add, choose one and work on it
|
| 21 |
-
|
| 22 |
-
12:51 AM
|
| 23 |
-
Looking at your features_to_add.txt, let me pick the best candidate based on impact vs. implementation effort for your current stack.
|
| 24 |
-
|
| 25 |
-
The strongest choice is Multi-turn conversation: it directly improves UX, fits naturally into your existing streaming architecture, and the LangChain ConversationBufferMemory pattern slots cleanly into rag_chain.py without touching ingestion or the vector store.
|
| 26 |
-
|
| 27 |
-
Let me also check what's in your features list more carefully before diving in.
|
| 28 |
-
|
| 29 |
-
The features listed are:
|
| 30 |
-
|
| 31 |
-
Contextual chunk expansion
|
| 32 |
-
HyDE (Hypothetical Document Embedding)
|
| 33 |
-
Multi-turn conversation (best ROI right now)
|
| 34 |
-
Answer faithfulness scoring
|
| 35 |
-
Query rewriting
|
| 36 |
-
Multi-language support
|
| 37 |
-
Incremental ingestion
|
| 38 |
-
Book versioning
|
| 39 |
-
Snippet preview on hover
|
| 40 |
-
Query suggestions
|
| 41 |
-
Topic explorer
|
| 42 |
-
Compare mode
|
| 43 |
-
Hallucination guardrail
|
| 44 |
-
Out-of-scope detection
|
| 45 |
-
Rate limiting
|
| 46 |
-
API key security
|
| 47 |
-
I'll implement Multi-turn conversation. Here's what changes:
|
| 48 |
-
|
| 49 |
-
rag_chain.py: session-based ConversationBufferMemory, history injected into the prompt
|
| 50 |
-
app.py: session ID via header/cookie, memory store per session
|
| 51 |
-
frontend/index.html: passes session_id, shows conversation continuity
|
| 52 |
-
Let me build it:
|
| 53 |
-
|
| 54 |
-
|
| 55 |
-
Check skills available
|
| 56 |
-
Check skills available
|
| 57 |
-
Now let me write the updated files. Starting with rag_chain.py:
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
Created 2 files
|
| 61 |
-
Created 2 files
|
| 62 |
-
Now the frontend: this is the most involved change. Adding session persistence, a "New Conversation" button, and follow-up UX cues:
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
Created a file, ran a command, read a file
|
| 66 |
-
Created a file, ran a command, read a file
|
| 67 |
-
|
| 68 |
-
Updated frontend index.html with multi-turn conversation support
|
| 69 |
-
index.html
|
| 70 |
-
|
| 71 |
-
Copy files to outputs
|
| 72 |
-
|
| 73 |
-
Script
|
| 74 |
-
|
| 75 |
-
Presented 3 files
|
| 76 |
-
Done
|
| 77 |
-
Here's everything that changed and why:
|
| 78 |
-
|
| 79 |
-
rag_chain.py
|
| 80 |
-
|
| 81 |
-
The core addition is an in-memory _session_store dict mapping session_id → [HumanMessage, AIMessage, ...]. The prompt template now uses MessagesPlaceholder(variable_name="history") so prior turns get injected directly into the LLM's context. Two important nuances: for follow-up questions that are too short/vague (under 8 words, like "elaborate on point 2"), the retrieval query is augmented with the previous human turn so ChromaDB can find relevant chunks. The semantic cache is also bypassed for follow-ups, since the same question in a different context should retrieve fresh context. History is trimmed to the last 6 turns (configurable via MAX_HISTORY_TURNS in .env) to avoid context window bloat.
|
| 82 |
-
|
| 83 |
-
app.py
|
| 84 |
-
|
| 85 |
-
Two new endpoints: POST /clear to reset a session's history, and GET /history for debugging/inspection. The /ask endpoint now accepts an optional session_id in the request body and also reads/sets it via an httponly cookie as fallback. The session ID is echoed back in an X-Session-Id response header so the frontend can capture it.
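The resolution order described here (request body first, cookie as fallback, otherwise a fresh ID) could look something like this; the helper name and signature are hypothetical, not the actual app.py code:

```python
import uuid

def resolve_session_id(body_session_id=None, cookie_session_id=None):
    """Pick the session ID for a request: an explicit session_id in the
    request body wins, the httponly cookie is the fallback, and otherwise
    a fresh UUID is minted. Returns (session_id, is_new) so the caller can
    set the cookie and echo the X-Session-Id response header."""
    if body_session_id:
        return body_session_id, False
    if cookie_session_id:
        return cookie_session_id, False
    return str(uuid.uuid4()), True
```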
|
| 86 |
-
|
| 87 |
-
frontend/index.html
|
| 88 |
-
|
| 89 |
-
The session ID is persisted to localStorage across page reloads. A session bar appears in the header once a conversation starts, showing the turn count and a "↺ New Conversation" button that calls /clear and resets the UI. User messages get a subtle follow-up pill badge after turn 1, and the textarea placeholder changes to hint that follow-ups work. The session_id is included in every /ask payload.
|
| 90 |
-
|
| 91 |
-
Rag chain
|
| 92 |
-
PY
|
| 93 |
-
App
|
| 94 |
-
PY
|
| 95 |
-
Index
|
| 96 |
-
Code Β· HTML
|
| 97 |
-
create a feature branch and push the changes to that branch
|
| 98 |
-
|
| 99 |
-
12:57 AM
|
| 100 |
-
Connector search
|
| 101 |
-
1 connector
|
| 102 |
-
Microsoft Learn
|
| 103 |
-
GitHub isn't available as a connected MCP tool, so I can't push directly. However, I can give you the exact commands to do it yourself; it'll take about 30 seconds.
|
| 104 |
-
|
| 105 |
-
bash
|
| 106 |
-
# 1. Navigate to your project root
|
| 107 |
-
cd /path/to/sacred-texts-rag
|
| 108 |
-
|
| 109 |
-
# 2. Make sure you're on main and up to date
|
| 110 |
-
git checkout main
|
| 111 |
-
git pull origin main
|
| 112 |
-
|
| 113 |
-
# 3. Create and switch to the feature branch
|
| 114 |
-
git checkout -b feature/multi-turn-conversation
|
| 115 |
-
|
| 116 |
-
# 4. Copy the updated files in (if you downloaded them)
|
| 117 |
-
# Or just overwrite in-place if you edited directly
|
| 118 |
-
|
| 119 |
-
# 5. Stage the three changed files
|
| 120 |
-
git add rag_chain.py app.py frontend/index.html
|
| 121 |
-
|
| 122 |
-
# 6. Commit
|
| 123 |
-
git commit -m "feat: add multi-turn conversation memory
|
| 124 |
-
|
| 125 |
-
- Add session-based ConversationBufferMemory in rag_chain.py
|
| 126 |
-
- Inject chat history via MessagesPlaceholder into LLM prompt
|
| 127 |
-
- Augment retrieval query for short follow-up questions
|
| 128 |
-
- Bypass semantic cache for follow-up turns
|
| 129 |
-
- Add /clear and /history endpoints in app.py
|
| 130 |
-
- Session ID via cookie + X-Session-Id header + request body
|
| 131 |
-
- Frontend: session bar, turn counter, follow-up pill badge,
|
| 132 |
-
localStorage session persistence, New Conversation button"
|
| 133 |
-
|
| 134 |
-
# 7. Push the branch
|
| 135 |
-
git push -u origin feature/multi-turn-conversation
|
| 136 |
-
Once pushed, GitHub will show a "Compare & pull request" banner on the repo page if you want to open a PR into main.
|
| 137 |
-
|
| 138 |
-
If you want to connect your GitHub account so I can do this directly next time, you can add the GitHub connector from the Tools menu in settings and I'll be able to create branches, commit, and push on your behalf.
|
| 139 |
-
|
| 140 |
-
Any update for the README file
|
| 141 |
-
|
| 142 |
-
1:05 AM
|
| 143 |
-
Yes, the README needs updating to reflect the new multi-turn conversation feature, the two new API endpoints, and the new environment variable. Here's the updated file:
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
Created a file, ran a command, read a file
|
| 147 |
-
Created a file, ran a command, read a file
|
| 148 |
-
|
| 149 |
-
Updated README.md with multi-turn conversation feature docs
|
| 150 |
-
README.md
|
| 151 |
-
|
| 152 |
-
Copy to outputs
|
| 153 |
-
|
| 154 |
-
Script
|
| 155 |
-
|
| 156 |
-
Presented file
|
| 157 |
-
Done
|
| 158 |
-
Here's what was updated vs. the original:
|
| 159 |
-
|
| 160 |
-
Added / changed sections:
|
| 161 |
-
|
| 162 |
-
The intro line now mentions multi-turn conversation
|
| 163 |
-
MAX_HISTORY_TURNS added to the environment variables table
|
| 164 |
-
The "How It Works" diagram now shows the two new steps β Session Memory and Query Augmentation β at the top of the pipeline
|
| 165 |
-
A new π¬ Multi-Turn Conversation section explains the follow-up UX with an example dialogue, how session IDs work, and the "New Conversation" button
|
| 166 |
-
The API Endpoints table now includes /clear and /history, and has a proper request/response example for /ask showing the NDJSON streaming format including cache hits
|
| 167 |
-
The notes section now documents the cache threshold and follow-up query augmentation behaviour
|
| 168 |
-
The features list at the bottom is cleaned up and deduplicated against what's already built
|
| 169 |
-
To push this along with the other files:
|
| 170 |
-
|
| 171 |
-
bash
|
| 172 |
-
git add README.md
|
| 173 |
-
git commit -m "docs: update README for multi-turn conversation feature"
|
| 174 |
-
git push
|
| 175 |
-
Readme
|
| 176 |
-
Document Β· MD
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
You've used 90% of your session limit
|
| 180 |
-
Upgrade
|
| 181 |
-
|
| 182 |
-
|
| 183 |
-
|
| 184 |
-
Claude is AI and can make mistakes. Please double-check responses.
|
| 185 |
-
Readme Β· MD
|
| 186 |
-
Copy
|
| 187 |
-
|
| 188 |
---
|
| 189 |
title: Sacred Texts RAG
|
| 190 |
emoji: 🏛️
|
|
@@ -194,15 +7,15 @@ sdk: docker
|
|
| 194 |
app_port: 7860
|
| 195 |
pinned: false
|
| 196 |
---
|
| 197 |
-
|
| 198 |
# 🏛️ Sacred Texts RAG – Multi-Religion Knowledge Base
|
| 199 |
-
|
| 200 |
A Retrieval-Augmented Generation (RAG) application that answers spiritual queries using the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib as the sole knowledge sources. Now with **multi-turn conversation memory**: ask follow-up questions naturally, just like a real dialogue.
|
| 201 |
-
|
| 202 |
---
|
| 203 |
-
|
| 204 |
## 📁 Project Structure
|
| 205 |
-
|
| 206 |
```
|
| 207 |
sacred-texts-rag/
|
| 208 |
├── README.md
|
|
@@ -214,22 +27,22 @@ sacred-texts-rag/
|
|
| 214 |
└── frontend/
|
| 215 |
    └── index.html   # Chat UI (served by FastAPI)
|
| 216 |
```
|
| 217 |
-
|
| 218 |
---
|
| 219 |
-
|
| 220 |
## ⚙️ Setup Instructions
|
| 221 |
-
|
| 222 |
### 1. Install Dependencies
|
| 223 |
```bash
|
| 224 |
pip install -r requirements.txt
|
| 225 |
```
|
| 226 |
-
|
| 227 |
### 2. Configure Environment
|
| 228 |
```bash
|
| 229 |
cp .env.example .env
|
| 230 |
# Edit .env and add your NVIDIA_API_KEY
|
| 231 |
```
|
| 232 |
-
|
| 233 |
### 3. Add Your PDF Books
|
| 234 |
Place your PDF files in a `books/` folder:
|
| 235 |
```
|
|
@@ -239,7 +52,7 @@ books/
|
|
| 239 |
├── bible.pdf
|
| 240 |
└── guru_granth_sahib.pdf
|
| 241 |
```
|
| 242 |
-
|
| 243 |
### 4. Ingest the Books (Run Once)
|
| 244 |
```bash
|
| 245 |
python ingest.py
|
|
@@ -249,20 +62,20 @@ This will:
|
|
| 249 |
- Split into semantic chunks
|
| 250 |
- Create embeddings using NVIDIA's `llama-nemotron-embed-vl-1b-v2` model
|
| 251 |
- Store in a local ChromaDB vector store (`./chroma_db/`)
|
| 252 |
-
|
| 253 |
### 5. Start the Backend
|
| 254 |
```bash
|
| 255 |
python app.py
|
| 256 |
```
|
| 257 |
Server runs at: `http://localhost:7860`
|
| 258 |
-
|
| 259 |
### 6. Open the Frontend
|
| 260 |
Navigate to `http://localhost:7860` in your browser; the FastAPI server serves the UI directly.
|
| 261 |
-
|
| 262 |
---
|
| 263 |
-
|
| 264 |
## 🔑 Environment Variables
|
| 265 |
-
|
| 266 |
| Variable | Description | Default |
|
| 267 |
|---|---|---|
|
| 268 |
| `NVIDIA_API_KEY` | Your NVIDIA API key | (required) |
|
|
@@ -272,11 +85,11 @@ Navigate to `http://localhost:7860` in your browser β the FastAPI server serve
|
|
| 272 |
| `MAX_HISTORY_TURNS` | Max conversation turns kept in memory per session | `6` |
|
| 273 |
| `HOST` | Server bind host | `0.0.0.0` |
|
| 274 |
| `PORT` | Server port | `7860` |
|
| 275 |
-
|
| 276 |
---
|
| 277 |
-
|
| 278 |
## 🧠 How It Works
|
| 279 |
-
|
| 280 |
```
|
| 281 |
User Query
|
| 282 |
│
|
|
@@ -304,34 +117,34 @@ User Query
|
|
| 304 |
▼
|
| 305 |
Streamed response with source citations (book + chapter/verse)
|
| 306 |
```
|
| 307 |
-
|
| 308 |
---
|
| 309 |
-
|
| 310 |
## 💬 Multi-Turn Conversation
|
| 311 |
-
|
| 312 |
The app maintains per-session conversation history so you can ask natural follow-up questions:
|
| 313 |
-
|
| 314 |
```
|
| 315 |
You: "What do the scriptures say about forgiveness?"
|
| 316 |
AI: [Answer citing Gita, Quran, Bible, Guru Granth Sahib]
|
| 317 |
-
|
| 318 |
You: "Elaborate on the second point"   ← follow-up, no context needed
|
| 319 |
AI: [Continues from previous answer]
|
| 320 |
-
|
| 321 |
You: "What does the Bible say specifically?"   ← drill-down
|
| 322 |
AI: [Focuses on Bible passages from the thread]
|
| 323 |
```
|
| 324 |
-
|
| 325 |
**How sessions work:**
|
| 326 |
- A session ID is created automatically on your first question and persisted in the browser's `localStorage`
|
| 327 |
- The server keeps the last `MAX_HISTORY_TURNS` (default: 6) human+AI pairs in memory
|
| 328 |
- Click **↺ New Conversation** in the header to clear history and start fresh
|
| 329 |
- Sessions are scoped to the server process; they reset on server restart
|
| 330 |
-
|
| 331 |
---
|
| 332 |
-
|
| 333 |
## 🔌 API Endpoints
|
| 334 |
-
|
| 335 |
| Method | Endpoint | Description |
|
| 336 |
|---|---|---|
|
| 337 |
| `POST` | `/ask` | Ask a question; streams NDJSON response |
|
|
@@ -341,7 +154,7 @@ AI: [Focuses on Bible passages from the thread]
|
|
| 341 |
| `GET` | `/health` | Health check |
|
| 342 |
| `GET` | `/` | Serves the frontend UI |
|
| 343 |
| `GET` | `/docs` | Swagger UI |
|
| 344 |
-
|
| 345 |
### `/ask` Request Body
|
| 346 |
```json
|
| 347 |
{
|
|
@@ -349,7 +162,7 @@ AI: [Focuses on Bible passages from the thread]
|
|
| 349 |
"session_id": "optional-uuid-string"
|
| 350 |
}
|
| 351 |
```
|
| 352 |
-
|
| 353 |
### `/ask` Response (streamed NDJSON)
|
| 354 |
```json
|
| 355 |
{"type": "token", "data": "The Bhagavad Gita teaches..."}
|
|
@@ -357,21 +170,21 @@ AI: [Focuses on Bible passages from the thread]
|
|
| 357 |
{"type": "sources", "data": [{"book": "Bhagavad Gita 2:47", "page": "2:47", "snippet": "..."}]}
|
| 358 |
```
|
| 359 |
Cache hits return a single `{"type": "cache", "data": {"answer": "...", "sources": [...]}}` line.
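A client can consume the stream above line by line. This minimal parser assumes only the three event types documented here (`token`, `sources`, `cache`) and is an illustrative sketch, not shipped code:

```python
import json

def consume_ndjson(lines):
    """Parse an /ask NDJSON stream into (answer_text, sources).

    Handles the three line types documented above: `token` lines are
    concatenated into the answer, a `sources` line replaces the source
    list, and a `cache` line carries the whole response at once."""
    answer_parts, sources = [], []
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        if event["type"] == "token":
            answer_parts.append(event["data"])
        elif event["type"] == "sources":
            sources = event["data"]
        elif event["type"] == "cache":
            # Cache hit: one line carries the full answer and sources.
            return event["data"]["answer"], event["data"]["sources"]
    return "".join(answer_parts), sources
```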
|
| 360 |
-
|
| 361 |
---
|
| 362 |
-
|
| 363 |
## 📝 Notes
|
| 364 |
-
|
| 365 |
- The LLM is instructed **never** to answer from outside the provided texts
|
| 366 |
- Each response includes **source citations** (book + chapter/verse where available)
|
| 367 |
- Responses synthesize wisdom **across all books** when relevant
|
| 368 |
- The semantic cache skips the LLM for repeated or near-identical questions (cosine distance < 0.35)
|
| 369 |
- Follow-up retrieval automatically augments vague short queries with the previous question for better semantic matching
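The cache check in the notes above can be sketched as follows; the 0.35 cosine-distance threshold comes from this README, while the embedding lookup structure is purely illustrative:

```python
import math

CACHE_DISTANCE_THRESHOLD = 0.35  # from the notes above

def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def cache_lookup(query_vec, cache):
    """Return the cached answer whose question embedding is within the
    threshold, if any. `cache` is an illustrative list of
    (embedding, answer) pairs, not the app's real cache structure."""
    best = min(cache, key=lambda entry: cosine_distance(query_vec, entry[0]),
               default=None)
    if best and cosine_distance(query_vec, best[0]) < CACHE_DISTANCE_THRESHOLD:
        return best[1]
    return None
```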
|
| 370 |
-
|
| 371 |
---
|
| 372 |
-
|
| 373 |
## 🗺️ Planned Features
|
| 374 |
-
|
| 375 |
- Contextual chunk expansion (fetch Β±1 surrounding chunks)
|
| 376 |
- HyDE – Hypothetical Document Embedding for abstract queries
|
| 377 |
- Answer faithfulness scoring (LLM-as-judge)
|
|
@@ -382,11 +195,9 @@ Cache hits return a single `{"type": "cache", "data": {"answer": "...", "sources
|
|
| 382 |
- Hallucination guardrail
|
| 383 |
- Out-of-scope detection
|
| 384 |
- Rate limiting & API key hardening
|
| 385 |
-
|
| 386 |
---
|
| 387 |
-
|
| 388 |
## 🎬 Demo
|
| 389 |
-
|
| 390 |
-
App Link: https://shouvik99-lifeguide.hf.space/
|
| 391 |
-
|
| 392 |