docs+reliability: README accuracy pass + artifact-cache failure tracking
README:
- Fix tool count: 10 → 12 (add glob and grep to the agent tools table)
- Document the two-tier LLM strategy (free runtime cascade + opt-in
Sonnet 4.6 premium tier for cached artifacts)
diagram_service: track LLM enrichment success and save with model=None
on failure, so the protection rule can replace degraded artifacts later
instead of locking in a partial result.
prebake_repos: collect per-step failures and reflect them in the exit
code + log, so a "successful" prebake actually means every artifact
landed (repo_map, tour, all diagrams, README) – not just ingestion.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- README.md +7 -3
- backend/services/diagram_service.py +25 -9
- scripts/prebake_repos.py +11 -5
README.md
CHANGED
````diff
@@ -11,7 +11,7 @@ pinned: false
 
 **A production-grade RAG system that maps any GitHub repository – built from scratch, without LangChain or LlamaIndex.**
 
-Index any public repo and ask natural-language questions about its code. Cartographer retrieves the exact functions and classes relevant to your question, explains them with source citations, and can autonomously investigate complex questions across multiple files using an agent with …
+Index any public repo and ask natural-language questions about its code. Cartographer retrieves the exact functions and classes relevant to your question, explains them with source citations, and can autonomously investigate complex questions across multiple files using an agent with 12 specialised tools. It can also generate a full README for any indexed repo on demand.
 
 **Live:** [cartographer-app.vercel.app](https://cartographer-app.vercel.app) · **Backend:** [HuggingFace Spaces](https://huggingface.co/spaces/umanggarg/cartographer)
 
@@ -23,7 +23,7 @@ Most RAG tutorials wrap a library and call it done. This project implements ever
 
 - **Ingestion** – GitHub API → AST-based code chunking → contextual LLM descriptions → dual-vector embedding → Qdrant Cloud
 - **Retrieval** – HyDE + query expansion + native hybrid search (dense + BM25) + cross-encoder reranking – each stage independently improving recall and precision
-- **Agent** – a ReAct loop with …
+- **Agent** – a ReAct loop with 12 MCP tools, working memory, parallel tool execution, and streaming thought traces
 - **UI** – the pipeline is visible: every retrieved chunk, agent thought, tool call, and confidence grade is shown to the user
 
 The result is both a useful tool and a study in how production AI systems are actually built.
@@ -88,6 +88,8 @@ Top-8 reranked chunks are injected as numbered sources. The LLM cites by `[1]`,
 Cerebras llama-3.3-70b (2600 tok/s, fastest) → Groq → Gemini → OpenRouter → Anthropic
 ```
 
+**Two-tier LLM strategy.** The free cascade above serves all runtime traffic – Q&A, agent mode, diagrams. A second, opt-in **premium tier** uses Claude Sonnet 4.6 to generate cached artifacts (concept tour, diagrams, README, repo_map) once per repo. Outputs are persisted in a `_artifacts` Qdrant collection and survive container restarts, so subsequent visitors get the high-quality artifacts at the free-cascade cost.
+
 ---
 
 ## Agent Mode
@@ -100,7 +102,7 @@ The agent communicates with tools via **MCP** – an open protocol for wiring LL
 
 This means every tool works with any MCP-compatible client, not just our agent.
 
-### …
+### 12 Agent Tools
 
 | Tool | What it does |
 |------|-------------|
@@ -109,6 +111,8 @@ This means every tool works with any MCP-compatible client, not just our agent.
 | `read_file` | Read any indexed file in full |
 | `get_file_chunk` | Read a precise line range from a file |
 | `list_files` | List all indexed files in a repo or subdirectory |
+| `glob` | Find files matching a glob pattern across the indexed repo |
+| `grep` | Regex search across indexed file contents |
 | `find_callers` | Find every call site of a function across the repo |
 | `trace_calls` | Walk the call chain from a function to see what it calls and what calls it |
 | `note` | Store a key-value fact in working memory for this session |
````
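The runtime cascade the README describes (Cerebras → Groq → Gemini → OpenRouter → Anthropic) boils down to an ordered try/except walk over provider clients. A minimal sketch of that pattern, with stand-in provider functions – the names and call signature here are illustrative, not Cartographer's actual interface:

```python
from collections.abc import Callable

# Hypothetical stand-ins for per-vendor clients; the real backend wraps
# each vendor SDK behind a common generate-text interface.
def _cerebras(prompt: str) -> str:
    raise RuntimeError("rate limited")  # simulate the fastest tier being down

def _groq(prompt: str) -> str:
    return f"groq answer to: {prompt}"

# Ordered fastest-first; a failure falls through to the next tier.
PROVIDERS: list[tuple[str, Callable[[str], str]]] = [
    ("cerebras", _cerebras),
    ("groq", _groq),
]

def generate(prompt: str) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, text)."""
    last_err: Exception | None = None
    for name, call in PROVIDERS:
        try:
            return name, call(prompt)
        except Exception as e:  # noqa: BLE001 - any provider error cascades
            last_err = e
    raise RuntimeError(f"all providers failed: {last_err}")
```

With the first tier simulated as rate-limited, `generate` transparently falls back to the second tier and reports which provider actually answered.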
backend/services/diagram_service.py
CHANGED
```diff
@@ -474,18 +474,23 @@ class DiagramService:
         # Architecture, Class, Data Flow: ground-truth structure from AST,
         # LLM only writes node descriptions.
         data = self._build_static_graph(repo, diagram_type)
+        enriched_ok = True  # LLM-built path: success implied by non-empty data
         if not data or not data.get("nodes"):
             # Fallback to LLM if static analysis yields nothing
             # (e.g. non-Python repo with no AST data)
             data = self._build_diagram_from_llm(repo, diagram_type, chunks)
         else:
-            data = self._enrich_nodes(repo, diagram_type, data, chunks)
+            data, enriched_ok = self._enrich_nodes(repo, diagram_type, data, chunks)
 
         if not data:
             return {"error": "Could not generate diagram. Try regenerating."}
 
         self._cache[cache_key] = data
-
+        # Only label the save with the active model if the LLM step actually
+        # succeeded. On enrichment failure, save with model=None so the
+        # protection rule can replace this degraded artifact later.
+        save_model = self._gen.current_model() if enriched_ok else None
+        self._save_diagram(repo, diagram_type, data, model=save_model)
         return {"diagram": data, "type": diagram_type}
 
     def build_tour(self, repo: str) -> dict:
@@ -689,6 +694,7 @@ class DiagramService:
             return
 
         yield {"stage": "building", "progress": 0.40, "message": "Building graph from AST…"}
+        enriched_ok = True
         if diagram_type == "sequence":
             data = self._build_sequence_from_llm(repo, chunks)
         else:
@@ -700,7 +706,7 @@ class DiagramService:
         else:
             yield {"stage": "enriching", "progress": 0.70,
                    "message": "Enriching node descriptions with AI…"}
-            data = self._enrich_nodes(repo, diagram_type, data, chunks)
+            data, enriched_ok = self._enrich_nodes(repo, diagram_type, data, chunks)
 
         if not data:
             yield {"stage": "error", "progress": 1.0,
@@ -708,7 +714,11 @@ class DiagramService:
             return
 
         self._cache[cache_key] = data
-
+        # See build_diagram() for rationale: skip premium label if the LLM
+        # enrichment call silently failed, otherwise the protection rule
+        # treats a structurally-only-correct diagram as premium-quality.
+        save_model = self._gen.current_model() if enriched_ok else None
+        self._save_diagram(repo, diagram_type, data, model=save_model)
         yield {"stage": "done", "progress": 1.0, "diagram": data, "type": diagram_type}
 
     def invalidate(self, repo: str):
@@ -1022,7 +1032,7 @@ class DiagramService:
 
         return {"nodes": nodes, "edges": edges}
 
-    def _enrich_nodes(self, repo: str, diagram_type: str, graph: dict, chunks: list[dict] | None = None) -> dict:
+    def _enrich_nodes(self, repo: str, diagram_type: str, graph: dict, chunks: list[dict] | None = None) -> tuple[dict, bool]:
         """
         Ask the LLM to write a short description for each node.
 
@@ -1034,12 +1044,17 @@ class DiagramService:
         snippets per node. Without this, the LLM only sees the node name and file
         and has to guess what the component does – the most common source of
         inaccurate descriptions. With snippets, descriptions are grounded in real code.
+
+        Returns (graph, enriched_ok). enriched_ok=False means the LLM call failed
+        and descriptions are missing; callers should not label the save with the
+        configured premium model in that case (otherwise the protection rule trusts
+        a degraded artifact as if it were premium-quality).
         """
         import json as _json
 
         nodes = graph.get("nodes", [])
         if not nodes:
-            return graph
+            return graph, True
 
         # Two lookups so we can attach real code for both diagram types:
         #
@@ -1116,11 +1131,12 @@ class DiagramService:
                 n["description"] = desc
             matched = sum(1 for n in nodes if n.get("description"))
             print(f"DiagramService: enriched {matched}/{len(nodes)} nodes with descriptions")
+            return graph, True
         except Exception as e:
             print(f"DiagramService: enrichment failed (non-fatal): {e}")
-            # Descriptions stay empty – diagram still shows accurate structure
-
-
+            # Descriptions stay empty – diagram still shows accurate structure,
+            # but signal failure so the save site doesn't tag this as premium.
+            return graph, False
 
     # ── LLM-based builders ────────────────────────────────────────────────────
 
```
scripts/prebake_repos.py
CHANGED
```diff
@@ -184,14 +184,20 @@ def bake_one(
     started = time.monotonic()
     if not ingest(repo, store, gen, embedder):
         return False
-    bake_repo_map(repo, repo_map_svc, force)
-    bake_tour(repo, diagram_svc, force)
+    # Each bake step returns False on failure. Track them all so the
+    # final exit code reflects whether the repo is *actually* fully
+    # baked, not just whether ingestion succeeded.
+    failures: list[str] = []
+    if not bake_repo_map(repo, repo_map_svc, force): failures.append("repo_map")
+    if not bake_tour(repo, diagram_svc, force): failures.append("tour")
     for dtype in DIAGRAM_TYPES:
-        bake_diagram(repo, dtype, diagram_svc, force)
-    bake_readme(repo, readme_svc, store, force)
+        if not bake_diagram(repo, dtype, diagram_svc, force): failures.append(f"diagram:{dtype}")
+    if not bake_readme(repo, readme_svc, store, force): failures.append("readme")
     elapsed = time.monotonic() - started
+    if failures:
+        print(f" ⚠ partial: {len(failures)} step(s) failed – {', '.join(failures)}")
     print(f" ⏱ {elapsed:.1f}s")
-    return
+    return not failures
 
 
 def main() -> int:
```
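For the new `return not failures` to matter, the caller has to fold each repo's boolean into the process exit code. `main()` is not shown in this diff, so the sketch below is an assumption about the wiring, not the actual script:

```python
# Hypothetical stand-in for bake_one(); the real function also takes the
# service objects and returns True only when every artifact landed.
def bake_one(repo: str) -> bool:
    return repo != "owner/broken-repo"  # simulate one incompletely-baked repo

def main(repos: list[str]) -> int:
    # Bake everything rather than stopping at the first failure, then
    # report which repos are only partially baked.
    results = [(r, bake_one(r)) for r in repos]
    failed = [r for r, ok in results if not ok]
    if failed:
        print(f"{len(failed)} repo(s) incompletely baked: {', '.join(failed)}")
    return 1 if failed else 0
```

In the real script the return value of `main()` would be handed to `sys.exit()`, so CI and cron jobs see a non-zero status whenever any artifact is missing.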
|