Spaces:
Sleeping
Sleeping
| # Cartographer β Build Plan | |
| A RAG system that indexes GitHub repositories and answers natural language | |
| questions about their code, architecture, and documentation. | |
| --- | |
| ## Learning Objectives | |
| By the end of this project you will understand: | |
| - How RAG works on source code (not just documents) | |
| - AST-based code chunking vs. fixed character windows | |
| - Code-aware embeddings vs. general text embeddings | |
| - Metadata-rich retrieval (file, function, class, language, line numbers) | |
| - Hosted vector databases (Qdrant Cloud) and why they enable free deployment | |
| - Live deployment: frontend on Vercel, backend on Render, vectors on Qdrant Cloud | |
| - Claude Code features: CLAUDE.md, hooks, slash commands, subagents | |
| --- | |
| ## Architecture Overview | |
| ``` | |
| GitHub URL | |
| β | |
| βΌ | |
| [Ingestion Pipeline] | |
| βββ Fetch repo via GitHub API (no clone needed for public repos) | |
| βββ Filter files by language β skip binaries, lock files, node_modules | |
| βββ Chunk by AST boundaries (functions, classes) | |
| β βββ fallback: character windows for markdown, config, plain text | |
| βββ Embed with nomic-embed-code (code-optimised model) | |
| βββ Store in Qdrant Cloud | |
| βββ metadata: repo, filepath, language, | |
| function_name, class_name, start_line, end_line | |
| β | |
| βΌ | |
| [Query Pipeline] | |
| βββ Embed query with same model | |
| βββ Hybrid search (dense vector + sparse BM25, native in Qdrant) | |
| βββ Relevance threshold (reject out-of-domain queries) | |
| βββ LLM generation (Groq / Claude) | |
| βββ citations: filepath + line range | |
| ``` | |
| --- | |
| ## Phases | |
| ### Phase 1 β Core Ingestion | |
| - [ ] `ingestion/repo_fetcher.py` β fetch file tree + content via GitHub API | |
| - [ ] `ingestion/file_filter.py` β include/exclude rules per language | |
| - [ ] `ingestion/code_chunker.py` β AST-based chunking for Python; character-window fallback for other file types | |
| - [ ] `ingestion/embedder.py` β embed chunks with `nomic-ai/nomic-embed-code` | |
| - [ ] `ingestion/qdrant_store.py` β upsert chunks into Qdrant Cloud collection | |
| ### Phase 2 β Retrieval & Generation | |
| - [ ] `retrieval/retrieval.py` β hybrid search using Qdrant's native dense + sparse | |
| - [ ] `backend/services/generation.py` β LLM answer generation with code-aware system prompt | |
| - [ ] `backend/services/ingestion_service.py` β orchestrate full ingestion pipeline | |
| - [ ] FastAPI backend with `/ingest`, `/query`, `/search` endpoints | |
| ### Phase 3 β UI | |
| - [ ] React + Vite frontend | |
| - [ ] Repo URL input instead of file upload | |
| - [ ] Citations show filepath + line numbers | |
| - [ ] Syntax-highlighted code chunks in source passages | |
| - [ ] Multi-repo selector in sidebar | |
| ### Phase 4 β Live Deployment | |
| - [ ] **Frontend β Vercel** (free, static hosting) | |
| - [ ] **Backend β Render** (free tier β lightweight since no local ML model) | |
| - [ ] **Vector DB β Qdrant Cloud** (permanent free tier, 1GB) | |
| - [ ] **Embeddings β Qdrant's built-in vectoriser** or Voyage AI API (removes model from backend, keeps Render on free tier) | |
| - [ ] Environment variable setup, CORS configuration | |
| - [ ] GitHub Actions CI: lint + deploy on push to main | |
| ### Phase 5 β Claude Code Features (Throughout) | |
| - [ ] `CLAUDE.md` β project briefing for Claude Code sessions | |
| - [ ] Hooks β auto-lint on file edit, reminder to update notes after commit | |
| - [ ] Slash commands β `/ingest-repo`, `/search-code`, `/add-to-notes` | |
| - [ ] Subagent patterns β parallel ingestion, expert review before PRs | |
| --- | |
| ## Tech Stack | |
| | Layer | Choice | Why | | |
| |---|---|---| | |
| | Repo fetch | GitHub REST API | No local clone needed; works without git installed | | |
| | Code parsing | `ast` (Python), `tree-sitter` (multi-lang) | Split at function/class boundaries | | |
| | Embeddings | `nomic-ai/nomic-embed-code` | Fine-tuned on code, free, runs locally | | |
| | Vector DB | Qdrant Cloud (free tier) | Permanent free 1GB, native hybrid search, enables deployment | | |
| | LLM | Groq Llama 3.3 70B / Claude Haiku | Fast, cheap/free | | |
| | Backend | FastAPI + Uvicorn | Lightweight, async, auto-docs | | |
| | Frontend | React + Vite | Fast dev server, small production bundle | | |
| | Frontend hosting | Vercel | Free, zero-config for Vite apps | | |
| | Backend hosting | Render | Free tier works once model is removed from server | | |
| | CI/CD | GitHub Actions | Lint and deploy on push | | |
| --- | |
| ## Deployment Architecture | |
| ``` | |
| User browser | |
| β | |
| βββ Static files βββ Vercel (free) | |
| β React UI | |
| β | |
| βββ API calls βββββββ Render (free) | |
| FastAPI backend | |
| β | |
| ββββ Qdrant Cloud (free) | |
| β Vector storage + hybrid search | |
| β | |
| ββββ Groq API (free) | |
| LLM generation | |
| ``` | |
| The key insight: by using **Qdrant Cloud** for vector storage and a **remote | |
| embedding API** (instead of running the model on the server), the backend | |
| becomes a lightweight HTTP service with minimal RAM usage β fitting within | |
| Render's free tier (512MB RAM). | |
| --- | |
| ## Notes Directory | |
| `notes/` is updated after every PR: | |
| - What was built | |
| - Key decisions made | |
| - Concepts learned | |
| - What's next | |
| See `notes/000-project-setup.md` for the first entry. | |