---
title: RAG Document Q&A System
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
license: mit
---

# 📚 RAG Document Q&A System

A Retrieval-Augmented Generation (RAG) system that answers questions about uploaded PDF documents.

## 🎯 What This Does

1. Upload a PDF document
2. Process the document (split it into chunks and create embeddings)
3. Ask questions about the document
4. Get accurate answers with source citations
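
Step 2 can be sketched as a fixed-size splitter with overlap — a simplified, standard-library stand-in for the `RecursiveCharacterTextSplitter` this project uses (1000-char chunks, 200-char overlap). The `split_text` helper below is illustrative, not part of the app:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks, each sharing `overlap` chars with the next."""
    chunks = []
    step = chunk_size - overlap  # advance 800 chars per chunk
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = split_text("x" * 2500)
# 2500 chars with step 800 → 3 chunks; consecutive chunks share 200 chars,
# so an answer spanning a chunk boundary still appears whole in one chunk.
```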

πŸ—οΈ Architecture

User Question β†’ Embedding β†’ Vector Search β†’ Retrieved Chunks β†’ LLM β†’ Answer
Component Technology
Embeddings sentence-transformers/all-MiniLM-L6-v2 (384 dimensions)
Vector Store FAISS (Facebook AI Similarity Search)
Text Splitter RecursiveCharacterTextSplitter (1000 chars, 200 overlap)
LLM HuggingFaceH4/zephyr-7b-beta via Inference API
Framework LangChain + Gradio
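
To make the vector-search step concrete, here is a brute-force cosine-similarity lookup in plain Python. FAISS performs the same nearest-neighbor search over the real 384-dimensional MiniLM embeddings, just far faster; the toy 3-d vectors and the `retrieve` helper are illustrative only:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, index, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Toy 3-dimensional "embeddings" (the real index stores 384-d MiniLM vectors).
index = [
    ("Chunk about cats", [0.9, 0.1, 0.0]),
    ("Chunk about dogs", [0.1, 0.9, 0.0]),
    ("Chunk about tax law", [0.0, 0.1, 0.9]),
]
print(retrieve([0.8, 0.2, 0.0], index, k=1))  # prints ['Chunk about cats']
```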

πŸ› οΈ Development Challenges

This project encountered several technical challenges during development:

### Challenge 1: LangChain API Changes

**Problem:** Import errors due to LangChain's package restructuring.

```python
# Old (broken)
from langchain.document_loaders import PyPDFLoader
from langchain.chains import RetrievalQA

# New (working)
from langchain_community.document_loaders import PyPDFLoader
# RetrievalQA is deprecated → compose an LCEL chain instead
```

**Lesson:** Fast-evolving libraries require checking current documentation.
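
The shape of the LCEL-style replacement can be sketched in plain Python (retrieve → format prompt → call LLM). The stub components and function names below are hypothetical; the real app composes LangChain runnables with the `|` operator instead:

```python
def make_chain(retriever, llm):
    """Compose retrieve → prompt → LLM, the shape an LCEL chain expresses."""
    def chain(question: str) -> str:
        context = "\n\n".join(retriever(question))
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        return llm(prompt)
    return chain

# Stub components so the sketch runs stand-alone.
fake_retriever = lambda q: ["The capital of France is Paris."]
fake_llm = lambda prompt: "Paris" if "Paris" in prompt else "I don't know"

qa = make_chain(fake_retriever, fake_llm)
print(qa("What is the capital of France?"))  # prints "Paris"
```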

### Challenge 2: PDF Download Issues

**Problem:** `PdfStreamError: Stream has ended unexpectedly`

**Cause:** Incomplete download due to a missing `User-Agent` header.

**Solution:** Added proper headers to the HTTP request.
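
The fix can be sketched with the standard library; the URL below is a placeholder, and the exact headers the app sends may differ:

```python
import urllib.request

# Some servers close the connection early (truncating the PDF) when no
# browser-like User-Agent is sent; setting one avoids the partial download.
url = "https://example.com/paper.pdf"  # placeholder URL
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# data = urllib.request.urlopen(req).read()  # full PDF bytes (network call)
print(req.get_header("User-agent"))  # prints "Mozilla/5.0"
```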

### Challenge 3: LLM Response Quality

**Problem:** FLAN-T5-Large produced fragment-like responses instead of complete answers.

**Attempted solutions:**

1. Adjusted generation parameters (minimal improvement)
2. Modified prompt format (slight improvement)
3. Switched to FLAN-T5-XL (out-of-memory error)

**Solution:** Switched to Zephyr-7B-beta, which produces comprehensive answers.

### Challenge 4: Hugging Face Spaces Python 3.13 Migration

**Problem:** The Space failed on startup with `ModuleNotFoundError: No module named 'audioop'`.

**Cause:** Hugging Face Spaces updated to Python 3.13, which removed the deprecated `audioop` module from the standard library. Gradio 4.x depended on `pydub`, which required `audioop`.

**Solution:** Upgraded to Gradio 6.3.0, which includes Python 3.13 compatibility fixes.
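
A small probe shows the underlying change; this is a diagnostic sketch, not code from the app:

```python
import sys
import importlib.util

# audioop was removed from the standard library in Python 3.13 (PEP 594),
# which is what broke Gradio 4.x's pydub dependency on Spaces.
has_audioop = importlib.util.find_spec("audioop") is not None
print(sys.version_info[:2], "audioop available:", has_audioop)
```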

### Challenge 5: Inference API Changes

**Problem:** `InferenceClient.text_generation()` failed with a "task not supported" error.

**Cause:** Hugging Face Inference API routing changed, requiring conversational models to use the `chat_completion` endpoint.

**Solution:** Refactored from raw prompt templates to the structured `chat_completion()` API with message arrays.
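
Building the message array is the runnable part of the sketch below; the actual `InferenceClient.chat_completion()` call is left commented because it needs a network connection and an HF token. The `build_messages` helper and the system prompt wording are illustrative, not the app's exact code:

```python
def build_messages(question: str, context: str) -> list[dict]:
    """Build the chat_completion message array from the RAG prompt pieces."""
    return [
        {"role": "system",
         "content": "Answer the question using only the provided context."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

messages = build_messages("What is RAG?", "RAG combines retrieval with generation.")

# Actual call (network + token required):
# from huggingface_hub import InferenceClient
# client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")
# reply = client.chat_completion(messages=messages, max_tokens=512)
# answer = reply.choices[0].message.content
print(messages[0]["role"])  # prints "system"
```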

πŸ“ Limitations

- Only processes PDF documents
- English language only
- Free Inference API has rate limits

## 👀 Author

**Nav772** - Built as part of an AI engineering portfolio

## 📚 Related Projects