---
title: RAG Document Q&A System
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
license: mit
---

# 📚 RAG Document Q&A System

A Retrieval-Augmented Generation (RAG) system that answers questions about uploaded PDF documents.

## 🎯 What This Does

1. Upload a PDF document
2. Process the document (split it into chunks and create embeddings)
3. Ask questions about the document
4. Get accurate answers with source citations
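
Step 2 can be sketched as a fixed-size splitter with overlap — a simplified, standard-library stand-in for the `RecursiveCharacterTextSplitter` this project uses (1000-char chunks, 200-char overlap). The `split_text` helper below is illustrative, not part of the app:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks, each sharing `overlap` chars with the next."""
    chunks = []
    step = chunk_size - overlap  # advance 800 chars per chunk
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = split_text("x" * 2500)
# 2500 chars with step 800 → 3 chunks; consecutive chunks share 200 chars,
# so an answer spanning a chunk boundary still appears whole in one chunk.
```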

πŸ—οΈ Architecture

User Question β†’ Embedding β†’ Vector Search β†’ Retrieved Chunks β†’ LLM β†’ Answer
Component Technology
Embeddings sentence-transformers/all-MiniLM-L6-v2 (384 dimensions)
Vector Store FAISS (Facebook AI Similarity Search)
Text Splitter RecursiveCharacterTextSplitter (1000 chars, 200 overlap)
LLM HuggingFaceH4/zephyr-7b-beta via Inference API
Framework LangChain + Gradio
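
To make the vector-search step concrete, here is a brute-force cosine-similarity lookup in plain Python. FAISS performs the same nearest-neighbor search over the real 384-dimensional MiniLM embeddings, just far faster; the toy 3-d vectors and the `retrieve` helper are illustrative only:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, index, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Toy 3-dimensional "embeddings" (the real index stores 384-d MiniLM vectors).
index = [
    ("Chunk about cats", [0.9, 0.1, 0.0]),
    ("Chunk about dogs", [0.1, 0.9, 0.0]),
    ("Chunk about tax law", [0.0, 0.1, 0.9]),
]
print(retrieve([0.8, 0.2, 0.0], index, k=1))  # prints ['Chunk about cats']
```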

πŸ› οΈ Development Challenges

This project encountered several technical challenges during development:

### Challenge 1: LangChain API Changes

**Problem:** Import errors due to LangChain's package restructuring.

```python
# Old (broken)
from langchain.document_loaders import PyPDFLoader
from langchain.chains import RetrievalQA

# New (working)
from langchain_community.document_loaders import PyPDFLoader
# RetrievalQA is deprecated → compose an LCEL chain instead
```

**Lesson:** Fast-evolving libraries require checking current documentation.
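
The shape of the LCEL-style replacement can be sketched in plain Python (retrieve → format prompt → call LLM). The stub components and function names below are hypothetical; the real app composes LangChain runnables with the `|` operator instead:

```python
def make_chain(retriever, llm):
    """Compose retrieve → prompt → LLM, the shape an LCEL chain expresses."""
    def chain(question: str) -> str:
        context = "\n\n".join(retriever(question))
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        return llm(prompt)
    return chain

# Stub components so the sketch runs stand-alone.
fake_retriever = lambda q: ["The capital of France is Paris."]
fake_llm = lambda prompt: "Paris" if "Paris" in prompt else "I don't know"

qa = make_chain(fake_retriever, fake_llm)
print(qa("What is the capital of France?"))  # prints "Paris"
```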

### Challenge 2: PDF Download Issues

**Problem:** `PdfStreamError: Stream has ended unexpectedly`

**Cause:** Incomplete download due to a missing `User-Agent` header.

**Solution:** Added proper headers to the HTTP request.
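
The fix can be sketched with the standard library; the URL below is a placeholder, and the exact headers the app sends may differ:

```python
import urllib.request

# Some servers close the connection early (truncating the PDF) when no
# browser-like User-Agent is sent; setting one avoids the partial download.
url = "https://example.com/paper.pdf"  # placeholder URL
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# data = urllib.request.urlopen(req).read()  # full PDF bytes (network call)
print(req.get_header("User-agent"))  # prints "Mozilla/5.0"
```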

### Challenge 3: LLM Response Quality

**Problem:** FLAN-T5-Large produced fragment-like responses instead of complete answers.

**Attempted solutions:**

1. Adjusted generation parameters (minimal improvement)
2. Modified prompt format (slight improvement)
3. Switched to FLAN-T5-XL (out-of-memory error)

**Solution:** Switched to Zephyr-7B-beta, which produces comprehensive answers.

### Challenge 4: Hugging Face Spaces Python 3.13 Migration

**Problem:** The Space failed on startup with `ModuleNotFoundError: No module named 'audioop'`.

**Cause:** Hugging Face Spaces updated to Python 3.13, which removed the deprecated `audioop` module from the standard library. Gradio 4.x depended on `pydub`, which required `audioop`.

**Solution:** Upgraded to Gradio 6.3.0, which includes Python 3.13 compatibility fixes.
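
A small probe shows the underlying change; this is a diagnostic sketch, not code from the app:

```python
import sys
import importlib.util

# audioop was removed from the standard library in Python 3.13 (PEP 594),
# which is what broke Gradio 4.x's pydub dependency on Spaces.
has_audioop = importlib.util.find_spec("audioop") is not None
print(sys.version_info[:2], "audioop available:", has_audioop)
```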

### Challenge 5: Inference API Changes

**Problem:** `InferenceClient.text_generation()` failed with a "task not supported" error.

**Cause:** Hugging Face Inference API routing changed, requiring conversational models to use the `chat_completion` endpoint.

**Solution:** Refactored from raw prompt templates to the structured `chat_completion()` API with message arrays.
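
Building the message array is the runnable part of the sketch below; the actual `InferenceClient.chat_completion()` call is left commented because it needs a network connection and an HF token. The `build_messages` helper and the system prompt wording are illustrative, not the app's exact code:

```python
def build_messages(question: str, context: str) -> list[dict]:
    """Build the chat_completion message array from the RAG prompt pieces."""
    return [
        {"role": "system",
         "content": "Answer the question using only the provided context."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

messages = build_messages("What is RAG?", "RAG combines retrieval with generation.")

# Actual call (network + token required):
# from huggingface_hub import InferenceClient
# client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")
# reply = client.chat_completion(messages=messages, max_tokens=512)
# answer = reply.choices[0].message.content
print(messages[0]["role"])  # prints "system"
```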

πŸ“ Limitations

- Only processes PDF documents
- English language only
- Free Inference API has rate limits

## 👀 Author

**Nav772** - Built as part of an AI engineering portfolio

## 📚 Related Projects