LifeGuide / features_to_add.txt

Shouvik599

Next set of feature improvements

e31b9ae 25 days ago

	Contextual chunk expansion — when a chunk is retrieved, also fetch the surrounding chunks (±1) to avoid cut-off verses losing their meaning
	Hypothetical Document Embedding (HyDE) - generate a hypothetical ideal answer first, embed that, then search with the resulting vector; this can improve recall for abstract questions, where the query wording shares little with the source text
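
A minimal sketch of the contextual chunk expansion idea, assuming the chunks of one book are kept in reading order in a list and that retrieval returns their positions. The function name `expand_with_neighbors` and the `window` parameter are hypothetical, not taken from the existing code.

```python
from typing import List, Sequence


def expand_with_neighbors(
    hit_indices: Sequence[int],
    chunks: Sequence[str],
    window: int = 1,
) -> List[str]:
    """Return the retrieved chunks plus `window` neighbours on each side.

    Assumes `chunks` holds one book's chunks in reading order and that
    `hit_indices` are positions of the retrieved chunks in that list.
    """
    keep = set()
    for idx in hit_indices:
        lo = max(0, idx - window)                 # clamp at the first chunk
        hi = min(len(chunks) - 1, idx + window)   # clamp at the last chunk
        keep.update(range(lo, hi + 1))
    # Sort so the expanded context reads in document order, without duplicates.
    return [chunks[i] for i in sorted(keep)]


chunks = ["v1", "v2", "v3", "v4", "v5", "v6"]
print(expand_with_neighbors([2], chunks))  # the hit "v3" plus "v2" and "v4"
```

Overlapping windows are merged through the set, so two adjacent hits do not repeat the shared neighbour in the prompt.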

	Multi-turn conversation — add chat history using LangChain ConversationBufferMemory so users can ask follow-up questions like "Elaborate on the second point"
	Answer faithfulness scoring — use an LLM-as-judge step to self-check whether the answer is actually grounded in the retrieved chunks before returning it
	Query rewriting — if the user query is vague, have the LLM rephrase it into a better search query before retrieval (improves semantic matching)
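
The three items above could be combined into one request flow, sketched below. This is an illustration under assumptions: `llm` and `retrieve` are hypothetical callables standing in for the real model and vector search, the prompts are placeholders, and chat history is a plain list rather than LangChain's ConversationBufferMemory.

```python
from typing import Callable, List, Tuple

LLM = Callable[[str], str]              # hypothetical: prompt in, text out
Retriever = Callable[[str], List[str]]  # hypothetical: query in, chunks out


def answer_with_checks(
    question: str,
    history: List[Tuple[str, str]],
    llm: LLM,
    retrieve: Retriever,
) -> dict:
    # 1. Query rewriting: fold the chat history into a standalone search query,
    #    so a follow-up like "Elaborate on the second point" can be resolved.
    past = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in history)
    search_query = llm(
        "Rewrite the final question as a standalone search query.\n"
        f"{past}\nUser: {question}\nSearch query:"
    ).strip()

    # 2. Retrieval and answer generation.
    chunks = retrieve(search_query)
    context = "\n---\n".join(chunks)
    answer = llm(
        f"Answer using only these passages:\n{context}\n\nQuestion: {question}"
    )

    # 3. LLM-as-judge: ask for a YES/NO verdict on groundedness.
    verdict = llm(
        "Is every claim in the answer supported by the passages? Reply YES or NO.\n"
        f"Passages:\n{context}\n\nAnswer:\n{answer}"
    )
    grounded = verdict.strip().upper().startswith("YES")

    # 4. Remember the turn so later follow-ups can refer back to it.
    history.append((question, answer))
    return {"query": search_query, "answer": answer, "grounded": grounded}
```

The caller decides what to do when `grounded` is false, for example retrying with more chunks or returning the answer with a warning.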

	Multi-language support — ingest Arabic Quran + Sanskrit Gita alongside English translations; embed both and let users query in their preferred language
	Incremental ingestion — track which PDFs have been ingested (via a manifest file) so re-running ingest.py only processes new books, not the whole library
	Book versioning — support multiple translations of the same book (e.g. KJV vs NIV Bible) and let users choose
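
For incremental ingestion, the manifest could be as simple as a JSON file mapping each PDF name to a content hash. The sketch below assumes a flat folder of PDFs; the helper names and the manifest layout are hypothetical and would need to fit whatever ingest.py does today.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, so a replaced PDF is detected too."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_manifest(manifest_path: Path) -> Dict[str, str]:
    """Read the manifest, or start with an empty one on the first run."""
    if manifest_path.exists():
        return json.loads(manifest_path.read_text(encoding="utf-8"))
    return {}


def find_new_pdfs(library_dir: Path, manifest: Dict[str, str]) -> List[Path]:
    """PDFs that are missing from the manifest or whose contents changed."""
    return [
        pdf
        for pdf in sorted(library_dir.glob("*.pdf"))
        if manifest.get(pdf.name) != file_digest(pdf)
    ]


def mark_ingested(
    pdfs: List[Path], manifest: Dict[str, str], manifest_path: Path
) -> None:
    """Record the given PDFs as ingested and write the manifest back to disk."""
    for pdf in pdfs:
        manifest[pdf.name] = file_digest(pdf)
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

Calling `mark_ingested` only after a book has been embedded successfully means a crashed run is retried on the next invocation.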

	Snippet preview on hover — show the actual retrieved passage when hovering over a source badge in the UI
	Query suggestions — after each answer, suggest 2-3 related follow-up questions
	Topic explorer — a sidebar with pre-grouped themes (Death & Afterlife, Compassion, Duty, Prayer) that users can browse
	Compare mode — a dedicated side-by-side view for "How does Book A vs Book B address X"

	Hallucination guardrail — run a separate verification pass checking every claim in the answer maps back to a retrieved chunk; flag or remove unsupported claims
	Out-of-scope detection — classify queries before retrieval; politely decline non-spiritual questions (e.g. "Write me code") with a prompt-level or classifier-level guard
	Rate limiting — add per-IP request throttling in FastAPI to prevent API key exhaustion
	API key security - keep NVIDIA_API_KEY and GEMINI_API_KEY in server-side storage only; never expose them in frontend calls
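
The rate limiting item could start from an in-memory sliding-window counter like the one below. It is a sketch under assumptions: the class name and parameters are hypothetical, state lives in one process only, and the client IP is taken at face value (behind a proxy it may be the proxy's address).

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client key."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

In FastAPI this could be wired in as a dependency that reads `request.client.host`, calls `allow`, and raises `HTTPException(status_code=429)` when it returns false.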


	Need to debug -
	1. General questions are answered without citing verses
	2. For exact-verse queries, the cache threshold score comes out the same for chapter 2 verse 4 and chapter 1 verse 10
	3.
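
One possible direction for debug item 2, offered as a guess rather than a diagnosis: if the cache compares query embeddings, two queries that differ only in their chapter and verse numbers may score almost identically. A workaround would be to detect exact verse references and key the cache on them directly. The regex, the function names, and the `book` argument below are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed phrasings: "chapter 2 verse 4" or "2:4"; real queries may vary more.
_REF = re.compile(
    r"(?:chapter\s+(\d+)\s*,?\s*verse\s+(\d+))|(?:\b(\d+)\s*:\s*(\d+)\b)",
    re.IGNORECASE,
)


def extract_verse_ref(query: str) -> Optional[Tuple[int, int]]:
    """Return (chapter, verse) if the query names an exact verse, else None."""
    m = _REF.search(query)
    if m is None:
        return None
    if m.group(1):
        chapter, verse = m.group(1), m.group(2)
    else:
        chapter, verse = m.group(3), m.group(4)
    return int(chapter), int(verse)


def cache_key(query: str, book: str) -> Optional[str]:
    """Exact key for verse lookups; None means fall back to the semantic cache."""
    ref = extract_verse_ref(query)
    if ref is None:
        return None
    return f"{book}:{ref[0]}:{ref[1]}"


print(cache_key("Explain chapter 2 verse 4", "gita"))
print(cache_key("Explain chapter 1 verse 10", "gita"))
```

With distinct keys for the two verses, an exact-match lookup cannot confuse them, and only queries without a verse reference go through the similarity threshold.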