Cartographer – Build Plan
A RAG system that indexes GitHub repositories and answers natural language questions about their code, architecture, and documentation.
Learning Objectives
By the end of this project you will understand:
- How RAG works on source code (not just documents)
- AST-based code chunking vs. fixed character windows
- Code-aware embeddings vs. general text embeddings
- Metadata-rich retrieval (file, function, class, language, line numbers)
- Hosted vector databases (Qdrant Cloud) and why they enable free deployment
- Live deployment: frontend on Vercel, backend on Render, vectors on Qdrant Cloud
- Claude Code features: CLAUDE.md, hooks, slash commands, subagents
Architecture Overview
GitHub URL
    │
    ▼
[Ingestion Pipeline]
 ├── Fetch repo via GitHub API (no clone needed for public repos)
 ├── Filter files by language → skip binaries, lock files, node_modules
 ├── Chunk by AST boundaries (functions, classes)
 │     └── fallback: character windows for markdown, config, plain text
 ├── Embed with nomic-embed-code (code-optimised model)
 └── Store in Qdrant Cloud
       └── metadata: repo, filepath, language,
           function_name, class_name, start_line, end_line
    │
    ▼
[Query Pipeline]
 ├── Embed query with same model
 ├── Hybrid search (dense vector + sparse BM25, native in Qdrant)
 ├── Relevance threshold (reject out-of-domain queries)
 └── LLM generation (Groq / Claude)
       └── citations: filepath + line range
Phases
Phase 1 – Core Ingestion
- `ingestion/repo_fetcher.py` – fetch file tree + content via GitHub API
- `ingestion/file_filter.py` – include/exclude rules per language
- `ingestion/code_chunker.py` – AST-based chunking for Python; character-window fallback for other file types (see the sketch after this list)
- `ingestion/embedder.py` – embed chunks with `nomic-ai/nomic-embed-code`
- `ingestion/qdrant_store.py` – upsert chunks into Qdrant Cloud collection
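
A minimal sketch of the AST chunker using Python's built-in `ast` module. The `CodeChunk` record and its field names are assumptions for illustration (they mirror the metadata payload in the architecture diagram), not the final schema:

```python
import ast
from dataclasses import dataclass

@dataclass
class CodeChunk:
    # Fields mirror the Qdrant payload listed in the architecture diagram.
    text: str
    filepath: str
    function_name: str | None
    class_name: str | None
    start_line: int
    end_line: int

def chunk_python_file(source: str, filepath: str) -> list[CodeChunk]:
    """Split a Python file at top-level function/class boundaries."""
    chunks: list[CodeChunk] = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            text = ast.get_source_segment(source, node)  # exact source text of the node
            if text is None:
                continue
            is_class = isinstance(node, ast.ClassDef)
            chunks.append(CodeChunk(
                text=text,
                filepath=filepath,
                function_name=None if is_class else node.name,
                class_name=node.name if is_class else None,
                start_line=node.lineno,
                end_line=node.end_lineno,
            ))
    return chunks
```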
Phase 2 – Retrieval & Generation
- `retrieval/retrieval.py` – hybrid search using Qdrant's native dense + sparse support (see the sketch after this list)
- `backend/services/generation.py` – LLM answer generation with a code-aware system prompt
- `backend/services/ingestion_service.py` – orchestrate the full ingestion pipeline
- FastAPI backend with `/ingest`, `/query`, and `/search` endpoints
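
A sketch of the hybrid query path using the `qdrant-client` Query API with reciprocal-rank fusion. The collection name (`repo_chunks`), the named vectors (`dense`, `sparse`), and thresholding the fused score are assumptions to adapt:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="https://YOUR-CLUSTER.qdrant.io", api_key="...")  # Qdrant Cloud

def hybrid_search(dense_vec: list[float], sparse_vec: models.SparseVector,
                  min_score: float, limit: int = 5):
    """Fuse dense + sparse results server-side, then apply a relevance threshold."""
    response = client.query_points(
        collection_name="repo_chunks",
        prefetch=[
            models.Prefetch(query=dense_vec, using="dense", limit=20),
            models.Prefetch(query=sparse_vec, using="sparse", limit=20),
        ],
        query=models.FusionQuery(fusion=models.Fusion.RRF),  # reciprocal rank fusion
        limit=limit,
        with_payload=True,  # payload carries filepath/line-range metadata for citations
    )
    # Relevance threshold for out-of-domain rejection. Note RRF scores are
    # rank-based, not cosine similarities, so min_score needs empirical calibration.
    return [p for p in response.points if p.score >= min_score]
```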
Phase 3 – UI
- React + Vite frontend
- Repo URL input instead of file upload
- Citations show filepath + line numbers (the assumed response shape is sketched after this list)
- Syntax-highlighted code chunks in source passages
- Multi-repo selector in sidebar
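
The citation UI is driven by what the backend returns. A sketch of an assumed `/query` response shape as Pydantic models, with field names taken from the chunk metadata (illustrative, not a fixed contract):

```python
from pydantic import BaseModel

class Citation(BaseModel):
    filepath: str
    start_line: int
    end_line: int
    language: str  # drives syntax highlighting of the source passage
    snippet: str   # the retrieved code chunk shown in the UI

class QueryResponse(BaseModel):
    answer: str
    citations: list[Citation]
```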
Phase 4 – Live Deployment
- Frontend → Vercel (free, static hosting)
- Backend → Render (free tier – lightweight since no local ML model)
- Vector DB → Qdrant Cloud (permanent free tier, 1GB)
- Embeddings → Qdrant's built-in vectoriser or the Voyage AI API (removes the model from the backend, keeps Render on the free tier)
- Environment variable setup, CORS configuration (see the sketch after this list)
- GitHub Actions CI: lint + deploy on push to main
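
A minimal sketch of the CORS setup in FastAPI. `FRONTEND_ORIGIN` is an assumed env var name and the Vercel URL is a hypothetical placeholder:

```python
import os

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the deployed frontend plus local dev.
origins = [
    os.environ.get("FRONTEND_ORIGIN", "https://cartographer.vercel.app"),  # hypothetical URL
    "http://localhost:5173",  # Vite's default dev-server port
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_methods=["*"],
    allow_headers=["*"],
)
```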
Phase 5 – Claude Code Features (Throughout)
- `CLAUDE.md` – project briefing for Claude Code sessions
- Hooks – auto-lint on file edit, reminder to update notes after commit
- Slash commands – `/ingest-repo`, `/search-code`, `/add-to-notes`
- Subagent patterns – parallel ingestion, expert review before PRs
Tech Stack
| Layer | Choice | Why |
|---|---|---|
| Repo fetch | GitHub REST API | No local clone needed; works without git installed |
| Code parsing | `ast` (Python), tree-sitter (multi-lang) | Split at function/class boundaries |
| Embeddings | `nomic-ai/nomic-embed-code` | Fine-tuned on code, free, runs locally |
| Vector DB | Qdrant Cloud (free tier) | Permanent free 1GB, native hybrid search, enables deployment |
| LLM | Groq Llama 3.3 70B / Claude Haiku | Fast, cheap/free |
| Backend | FastAPI + Uvicorn | Lightweight, async, auto-docs |
| Frontend | React + Vite | Fast dev server, small production bundle |
| Frontend hosting | Vercel | Free, zero-config for Vite apps |
| Backend hosting | Render | Free tier works once model is removed from server |
| CI/CD | GitHub Actions | Lint and deploy on push |
Deployment Architecture
User browser
    │
    ├── Static files ──► Vercel (free)
    │                    React UI
    │
    └── API calls ─────► Render (free)
                         FastAPI backend
                             │
                             ├──► Qdrant Cloud (free)
                             │    Vector storage + hybrid search
                             │
                             └──► Groq API (free)
                                  LLM generation
The key insight: by using Qdrant Cloud for vector storage and a remote embedding API (instead of running the model on the server), the backend becomes a lightweight HTTP service with minimal RAM usage, fitting within Render's free tier (512MB RAM).
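
As one way to realise this, the backend can call a hosted embedding endpoint instead of loading a model. A sketch against Voyage AI's REST API; the model name and response handling should be verified against the provider's current docs:

```python
import os

import requests

def embed_remote(texts: list[str]) -> list[list[float]]:
    """Embed via a hosted API so no ML model runs inside the Render process."""
    resp = requests.post(
        "https://api.voyageai.com/v1/embeddings",
        headers={"Authorization": f"Bearer {os.environ['VOYAGE_API_KEY']}"},
        json={"input": texts, "model": "voyage-code-2"},  # code-tuned model; check current docs
        timeout=30,
    )
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]
```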
Notes Directory
`notes/` is updated after every PR:
- What was built
- Key decisions made
- Concepts learned
- What's next
See `notes/000-project-setup.md` for the first entry.