--- title: RAG Document Q&A System emoji: 📚 colorFrom: blue colorTo: purple sdk: gradio sdk_version: 6.3.0 app_file: app.py pinned: false license: mit --- # 📚 RAG Document Q&A System A Retrieval-Augmented Generation (RAG) system that answers questions about uploaded PDF documents. ## 🎯 What This Does 1. **Upload** a PDF document 2. **Process** the document (chunks it and creates embeddings) 3. **Ask** questions about the document 4. **Get** accurate answers with source citations ## 🏗️ Architecture ``` User Question → Embedding → Vector Search → Retrieved Chunks → LLM → Answer ``` | Component | Technology | |-----------|------------| | Embeddings | sentence-transformers/all-MiniLM-L6-v2 (384 dimensions) | | Vector Store | FAISS (Facebook AI Similarity Search) | | Text Splitter | RecursiveCharacterTextSplitter (1000 chars, 200 overlap) | | LLM | HuggingFaceH4/zephyr-7b-beta via Inference API | | Framework | LangChain + Gradio | ## 🛠️ Development Challenges This project encountered several technical challenges during development: ### Challenge 1: LangChain API Changes **Problem:** Import errors due to LangChain's package restructuring. ```python # Old (broken) from langchain.document_loaders import PyPDFLoader from langchain.chains import RetrievalQA # New (working) from langchain_community.document_loaders import PyPDFLoader # RetrievalQA deprecated → use LCEL chains instead ``` **Lesson:** Fast-evolving libraries require checking current documentation. ### Challenge 2: PDF Download Issues **Problem:** `PdfStreamError: Stream has ended unexpectedly` **Cause:** Incomplete download due to missing User-Agent header. **Solution:** Added proper headers to HTTP request. ### Challenge 3: LLM Response Quality **Problem:** FLAN-T5-Large produced fragment-like responses instead of complete answers. **Attempted Solutions:** 1. Adjusted generation parameters — minimal improvement 2. Modified prompt format — slight improvement 3. Switched to FLAN-T5-XL — OOM error **Solution:** Switched to Zephyr-7B-beta, which produces comprehensive answers. ### Challenge 4: Hugging Face Spaces Python 3.13 Migration **Problem:** Space failed on startup with `ModuleNotFoundError: No module named 'audioop'` **Cause:** Hugging Face Spaces updated to Python 3.13, which removed the deprecated `audioop` module from the standard library. Gradio 4.x depended on `pydub`, which required `audioop`. **Solution:** Upgraded to Gradio 6.3.0, which includes Python 3.13 compatibility fixes. ### Challenge 5: Inference API Changes **Problem:** `InferenceClient.text_generation()` failed with "task not supported" error. **Cause:** The Hugging Face Inference API routing changed, requiring conversational models to use the `chat_completion` endpoint. **Solution:** Refactored from raw prompt templates to the structured `chat_completion()` API with message arrays. ## 📝 Limitations - Only processes PDF documents - English language only - Free Inference API has rate limits ## 👤 Author [Nav772](https://huggingface.co/Nav772) - Built as part of AI Engineering portfolio ## 📚 Related Projects - [Movie Sentiment Analyzer](https://huggingface.co/spaces/Nav772/movie-sentiment-analyzer) - [Amazon Review Rating Predictor](https://huggingface.co/spaces/Nav772/amazon-review-rating-predictor) - [Food Image Classifier](https://huggingface.co/spaces/Nav772/food-image-classifier) - [Sentiment Model Comparison](https://huggingface.co/spaces/Nav772/sentiment-model-comparison)