Commit · a28e972
1 Parent(s): e054490
Added Project Structure
README.md
CHANGED
|
@@ -59,6 +59,7 @@ This system implements a complete RAG pipeline with the following components:
|
|
| 59 |
- Comprehensive error handling and validation
|
| 60 |
- Modular architecture for easy extension
|
| 61 |
|
|
|
|
| 62 |
## Local Development
|
| 63 |
|
| 64 |
### Prerequisites
|
|
@@ -100,43 +101,6 @@ python app/main.py
|
|
| 100 |
|
| 101 |
The application will start on `http://localhost:7860`
|
| 102 |
|
| 103 |
-
## Deployment to Hugging Face Spaces
|
| 104 |
-
|
| 105 |
-
### Method 1: Direct Upload
|
| 106 |
-
|
| 107 |
-
1. Create a new Space on [Hugging Face](https://huggingface.co/new-space)
|
| 108 |
-
2. Select "Gradio" as SDK
|
| 109 |
-
3. Upload repository files
|
| 110 |
-
4. Add repository secret:
|
| 111 |
-
- Navigate to Settings → Repository secrets
|
| 112 |
-
- Create `OPENROUTER_API_KEY` with your API key
|
| 113 |
-
5. Space will auto-deploy
|
| 114 |
-
|
| 115 |
-
### Method 2: Git Push
|
| 116 |
-
|
| 117 |
-
```bash
|
| 118 |
-
# Add Hugging Face remote
|
| 119 |
-
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/SPACE_NAME
|
| 120 |
-
|
| 121 |
-
# Push to Hugging Face
|
| 122 |
-
git push hf main
|
| 123 |
-
```
|
| 124 |
-
|
| 125 |
-
**Important**: Ensure the YAML frontmatter (lines 1-9) remains at the top of README.md for proper Space configuration.
|
| 126 |
-
|
| 127 |
-
## Usage
|
| 128 |
-
|
| 129 |
-
1. **Upload Document**: Select PDF, DOCX, or TXT file (max recommended: 50MB)
|
| 130 |
-
2. **Process**: Click "Process Document" to chunk and index
|
| 131 |
-
3. **Query**: Ask natural language questions about the content
|
| 132 |
-
4. **Review**: Receive markdown-formatted answers with context
|
| 133 |
-
|
| 134 |
-
### Example Queries
|
| 135 |
-
|
| 136 |
-
- "What are the main conclusions of this research paper?"
|
| 137 |
-
- "Summarize the key points from section 3"
|
| 138 |
-
- "What methodology was used in this study?"
|
| 139 |
-
- "Extract all mentioned dates and events"
|
| 140 |
|
| 141 |
## Project Structure
|
| 142 |
|
|
@@ -158,56 +122,6 @@ ai-rag-document/
|
|
| 158 |
└── README.md
|
| 159 |
```
|
| 160 |
|
| 161 |
-
## Technical Implementation Details
|
| 162 |
-
|
| 163 |
-
### Text Chunking Strategy
|
| 164 |
-
|
| 165 |
-
Uses `RecursiveCharacterTextSplitter` with:
|
| 166 |
-
- **Chunk size**: 1000 characters (balances context vs. precision)
|
| 167 |
-
- **Overlap**: 200 characters (prevents context loss at boundaries)
|
| 168 |
-
- **Metadata preservation**: Tracks source file and document type
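The size and overlap settings in the removed list can be illustrated with a plain sliding window. This is a minimal sketch, not the project's code and not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and sentences). The function name `chunk_text` is hypothetical; only the 1000/200 values come from the README.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only: a simplified stand-in
    for a separator-aware splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2500))
chunks = chunk_text(sample)
```

With these defaults, the last 200 characters of one chunk are repeated as the first 200 characters of the next, which is what keeps a sentence that straddles a boundary retrievable from at least one chunk.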
|
| 169 |
-
|
| 170 |
-
### Embedding Model Selection
|
| 171 |
-
|
| 172 |
-
BAAI/bge-small-en-v1.5 chosen for:
|
| 173 |
-
- Superior performance on MTEB benchmark vs. all-MiniLM-L6-v2
|
| 174 |
-
- 384-dimension vectors (compact yet effective)
|
| 175 |
-
- Instruction-tuned for retrieval tasks
|
| 176 |
-
- L2 normalization for cosine similarity
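The last bullet in the removed list relies on a standard identity: once vectors are scaled to unit L2 norm, their dot product equals their cosine similarity. The sketch below shows this with plain Python lists; the function names are hypothetical and the two-dimensional vectors are toy values, not real 384-dimension embeddings.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]
direct = cosine_similarity(a, b)
via_normalized = dot(l2_normalize(a), l2_normalize(b))
```

Normalizing once at indexing time means each query only needs a dot product, which is the usual reason for storing normalized embeddings.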
|
| 177 |
-
|
| 178 |
-
### LLM Configuration
|
| 179 |
-
|
| 180 |
-
Google Gemma 3-4B-IT via OpenRouter:
|
| 181 |
-
- **Free tier**: No cost, suitable for demos and light production
|
| 182 |
-
- **Temperature 0.1**: Reduces hallucination, increases factuality
|
| 183 |
-
- **Max tokens 512**: Concise answers, faster responses
|
| 184 |
-
- **OpenRouter benefits**: Unified API, no vendor lock-in
|
| 185 |
-
|
| 186 |
-
### Prompt Engineering
|
| 187 |
-
|
| 188 |
-
The system uses a carefully designed prompt:
|
| 189 |
-
- Explicit instruction against hallucination
|
| 190 |
-
- Context grounding requirement
|
| 191 |
-
- Markdown formatting for readability
|
| 192 |
-
- Fallback response for insufficient context
|
| 193 |
-
|
| 194 |
-
## Testing
|
| 195 |
-
|
| 196 |
-
```bash
|
| 197 |
-
# Run tests
|
| 198 |
-
python -m pytest tests/
|
| 199 |
-
|
| 200 |
-
# Run specific test
|
| 201 |
-
python -m pytest tests/test_rag_pipeline.py -v
|
| 202 |
-
```
|
| 203 |
-
|
| 204 |
-
## Limitations and Considerations
|
| 205 |
-
|
| 206 |
-
- **Rate limit**: 10 queries/hour (configurable in `rag_pipeline.py`)
|
| 207 |
-
- **Document size**: Large files (>100MB) may cause memory issues
|
| 208 |
-
- **Context window**: Limited to 4 retrieved chunks per query
|
| 209 |
-
- **Free tier**: OpenRouter free tier has usage limits
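A limit such as 10 queries per hour can be enforced with a sliding window over recent request timestamps. The following is a sketch under assumptions: the diff does not show how `rag_pipeline.py` implements its limit, and the class name, parameters, and injectable clock are hypothetical.

```python
import time
from collections import deque


class SlidingWindowRateLimiter:
    """Allow at most `max_calls` within any `window_seconds` span.

    Hypothetical illustration; not taken from rag_pipeline.py.
    """

    def __init__(self, max_calls: int = 10, window_seconds: float = 3600.0,
                 clock=time.monotonic):
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock          # injectable so the limiter can be tested
        self._calls = deque()        # timestamps of accepted calls, oldest first

    def allow(self) -> bool:
        """Record and accept the call if the window has room, else reject it."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()
        if len(self._calls) >= self._max_calls:
            return False
        self._calls.append(now)
        return True


fake_now = [0.0]
limiter = SlidingWindowRateLimiter(clock=lambda: fake_now[0])
first_ten = [limiter.allow() for _ in range(10)]
eleventh = limiter.allow()
fake_now[0] = 3600.0
after_an_hour = limiter.allow()
```

Passing the clock in as a parameter keeps the example deterministic; in an application the default `time.monotonic` would be used.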
|
| 210 |
-
|
| 211 |
## Future Enhancements
|
| 212 |
|
| 213 |
- Multi-document cross-referencing
|
|
@@ -225,4 +139,4 @@ This project is open source and available for portfolio and educational purposes
|
|
| 225 |
|
| 226 |
**Prateek Kumar Goel**
|
| 227 |
- GitHub: [@pkgprateek](https://github.com/pkgprateek)
|
| 228 |
-
- Project deployed on [Hugging Face Spaces](https://huggingface.co/spaces)
|
|
|
|
| 59 |
- Comprehensive error handling and validation
|
| 60 |
- Modular architecture for easy extension
|
| 61 |
|
| 62 |
+
---
|
| 63 |
## Local Development
|
| 64 |
|
| 65 |
### Prerequisites
|
|
|
|
| 101 |
|
| 102 |
The application will start on `http://localhost:7860`
|
| 103 |
|
| 104 |
|
| 105 |
## Project Structure
|
| 106 |
|
|
|
|
| 122 |
└── README.md
|
| 123 |
```
|
| 124 |
|
| 125 |
## Future Enhancements
|
| 126 |
|
| 127 |
- Multi-document cross-referencing
|
|
|
|
| 139 |
|
| 140 |
**Prateek Kumar Goel**
|
| 141 |
- GitHub: [@pkgprateek](https://github.com/pkgprateek)
|
| 142 |
+
- Project deployed on [Hugging Face Spaces](https://huggingface.co/spaces/pkgprateek)
|