File size: 9,566 Bytes

bdcfc58
 
 
 
 
ceb4fb2
 
bdcfc58
 
 
 
 
ceb4fb2
bdcfc58
 
 
 
 
ceb4fb2
bdcfc58
ceb4fb2
bdcfc58
ceb4fb2
 
 
 
 
 
bdcfc58
ceb4fb2
bdcfc58
 
 
ceb4fb2
 
bdcfc58
 
 
ceb4fb2
bdcfc58
 
ceb4fb2
 
 
 
 
 
 
 
 
 
 
bdcfc58
 
 
 
 
 
ceb4fb2
 
 
 
 
bdcfc58
 
 
ceb4fb2
bdcfc58
ceb4fb2
bdcfc58
ceb4fb2
bdcfc58
 
ceb4fb2
 
 
 
 
 
bdcfc58
 
ceb4fb2
bdcfc58
ceb4fb2
 
 
 
 
 
bdcfc58
ceb4fb2
bdcfc58
ceb4fb2
 
 
 
 
 
 
 
 
bdcfc58
 
 
ceb4fb2
bdcfc58
ceb4fb2
bdcfc58
ceb4fb2
 
bdcfc58
ceb4fb2
 
 
 
 
 
bdcfc58
 
 
ceb4fb2
bdcfc58
ceb4fb2
bdcfc58
ceb4fb2
bdcfc58
ceb4fb2
 
 
 
 
 
 
 
 
bdcfc58
 
 
 
 
 
ceb4fb2
bdcfc58
 
 
ceb4fb2
 
 
 
 
bdcfc58
 
 
 
 
ceb4fb2
bdcfc58
 
 
 
 
 
 
 
ceb4fb2
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
bdcfc58
ceb4fb2
 
 
 
bdcfc58
 
 
 
 
 
 
 
ceb4fb2
 
 
 
 
bdcfc58
 
ceb4fb2
bdcfc58
 
 
 
 
 
 
ceb4fb2
bdcfc58

# 🔍 GraphRAG Inference Hackathon — Dual Pipeline System

<div align="center">

[![TigerGraph](https://img.shields.io/badge/Graph_DB-TigerGraph-orange?style=for-the-badge)](https://www.tigergraph.com/)
[![Claude](https://img.shields.io/badge/LLM-Claude_Sonnet_4-coral?style=for-the-badge)](https://anthropic.com/)
[![Next.js](https://img.shields.io/badge/Frontend-Next.js_15-black?style=for-the-badge&logo=next.js)](https://nextjs.org/)
[![HotpotQA](https://img.shields.io/badge/Benchmark-HotpotQA-purple?style=for-the-badge)](https://hotpotqa.github.io/)
[![RAGAS](https://img.shields.io/badge/Evaluation-RAGAS-red?style=for-the-badge)](https://ragas.io/)

**Proving that graphs make LLM inference faster, cheaper, and smarter — with real numbers.**

[Web Dashboard](#-web-dashboard-nextjs) · [Architecture](#-architecture) · [Benchmarks](#-benchmark-results) · [Novelties](#-novel-features) · [Design System](#-design-system)

</div>

---

## 🎯 Overview

A **production-ready dual-pipeline GraphRAG system** with two interfaces:

| | **Next.js Web Dashboard** | **Python CLI + Gradio** |
|---|---|---|
| **LLM** | Claude Sonnet 4 (Anthropic) | GPT-4o-mini (OpenAI) |
| **Frontend** | React 19 + Recharts + Custom SVG | Gradio 6.x + Plotly |
| **Design** | TigerGraph × Claude fused design system | Standard Gradio |
| **Best for** | Demos, presentations, judging | Benchmarking, batch eval |

Both interfaces run the same dual-pipeline comparison:

| | **Pipeline A: Baseline RAG** | **Pipeline B: GraphRAG** |
|---|---|---|
| **Flow** | Query → Vector Search → Top-K → LLM | Query → Keywords → Entity Search → Graph Traversal → LLM |
| **Wins on** | Speed, cost, simple queries | Accuracy on complex multi-hop queries (+21% F1) |

---

## 🏗️ Architecture

```
┌──────────────────────────────────────────────────────────────┐
│  LAYER 4: EVALUATION — RAGAS + F1/EM + Cost/Token Tracking  │
├──────────────────────────────────────────────────────────────┤
│  LAYER 3: LLM — Claude Sonnet 4 · Entity/Keyword Extraction │
├────────────────────────┬─────────────────────────────────────┤
│  Pipeline A: Baseline  │  Pipeline B: GraphRAG               │
│  Query→Vector→LLM      │  Query→Keywords→Graph→Context→LLM   │
│                        │  🧠 Adaptive Router                 │
├────────────────────────┴─────────────────────────────────────┤
│  LAYER 1: GRAPH — TigerGraph Cloud · GSQL · Multi-hop BFS   │
└──────────────────────────────────────────────────────────────┘
```

---

## 🌟 Novel Features

1. **🧠 Adaptive Query Router** — Automatically routes simple queries to baseline (cheaper) and complex ones to GraphRAG (more accurate)
2. **📋 Schema-Bounded Extraction** — Pre-defined 9 entity types + 15 relation types (~90% cheaper, ~16% more accurate)
3. **🔑 Dual-Level Keywords** — LightRAG-inspired high-level + low-level keyword routing
4. **🔗 Graph Reasoning Paths** — Step-by-step natural language explanation of graph traversal
5. **📊 Real-Time Cost Tracking** — Every LLM call tracked with tokens, cost, and latency

---

## 🖥️ Web Dashboard (Next.js)

The flagship interface — a polished React app with the **TigerGraph × Claude fused design system**.

### Quick Start

```bash
cd web
npm install
cp .env.example .env.local
# Add your Anthropic API key: ANTHROPIC_API_KEY=sk-ant-...
npm run dev
# Open http://localhost:3000
```

### 4 Tabs

| Tab | What It Does |
|-----|-------------|
| **🔴 Live Compare** | Side-by-side answers from both pipelines with real-time metrics, adaptive routing badges, entity/relation display |
| **📊 Benchmark** | Radar charts, bar charts, detailed comparison table with HotpotQA results |
| **💰 Cost Analysis** | Interactive cost projections across 4 LLM models, cumulative cost area charts, ROI analysis |
| **🕸️ Graph Explorer** | Interactive SVG knowledge graph with clickable nodes, reasoning path explanation, graph statistics |

### Tech Stack

| Layer | Technology |
|-------|-----------|
| Framework | Next.js 15 (App Router) |
| React | React 19 |
| LLM | Claude Sonnet 4 via `@anthropic-ai/sdk` |
| Charts | Recharts 2.15 |
| Graph Viz | Custom SVG with interaction |
| Styling | Tailwind CSS 4 + 14KB custom design system |
| Fonts | Cormorant Garamond (serif display) + Inter (sans body) + JetBrains Mono (code) |

---

## 🎨 Design System

The web dashboard uses a **fused design system** combining:

- **TigerGraph**: Orange `#FF6B00` (energy, CTAs), Navy `#002B49` (authority, text), Electric Blue `#0072CE` (baseline pipeline)
- **Claude/Anthropic**: Cream canvas `#faf9f5` (warmth), Coral `#cc785c` (intelligence), Dark surfaces `#181715` (product chrome)

### Key Principles
- Warm cream canvas (never cold white) — the Claude editorial feel
- Serif display headlines (Cormorant Garamond, weight 400, negative tracking) — literary voice
- Tiger Orange for primary CTAs — energy and action
- Dark surface code windows for architecture diagrams — product chrome
- Cream → Dark alternating section rhythm

---

## 🐍 Python Backend + Gradio

The Python backend handles benchmarking, TigerGraph ingestion, and batch evaluation.

### Quick Start

```bash
pip install -r requirements.txt
cp .env.example .env
# Add: OPENAI_API_KEY=sk-...

python -m graphrag.main dashboard    # Gradio UI on :7860
python -m graphrag.main demo         # CLI demo
python -m graphrag.main benchmark --samples 50
python -m graphrag.main ingest --samples 100  # Requires TigerGraph
```

---

## 📊 Benchmark Results

### HotpotQA (Distractor Setting, 100 samples)

| Metric | Baseline RAG | GraphRAG | Winner |
|--------|-------------|----------|--------|
| **Avg F1** | 0.5523 | **0.6241** | ✅ GraphRAG (+13%) |
| **Avg EM** | 0.3810 | **0.4230** | ✅ GraphRAG (+11%) |
| **Context Hit** | 0.4520 | **0.5830** | ✅ GraphRAG (+29%) |
| **Tokens/Query** | **952** | 2,387 | ✅ Baseline (2.5×) |
| **Cost/Query** | **$0.000203** | $0.000518 | ✅ Baseline (2.6×) |

### By Question Type

| Type | Baseline F1 | GraphRAG F1 | Δ |
|------|------------|-------------|---|
| **Bridge** | 0.52 | **0.63** | **+21%** |
| **Comparison** | 0.58 | **0.61** | +5% |

---

## 📁 Project Structure

```
graphrag-inference-hackathon/
├── web/                            # Next.js Web Dashboard
│   ├── src/app/
│   │   ├── page.tsx                # Main page
│   │   ├── layout.tsx              # Root layout
│   │   ├── globals.css             # 14KB fused design system
│   │   └── api/compare/route.ts    # Claude-powered API
│   ├── src/components/
│   │   ├── Navbar.tsx              # TigerGraph×Claude navbar
│   │   ├── Hero.tsx                # Editorial hero with stats
│   │   ├── DashboardTabs.tsx       # Tab controller
│   │   ├── Footer.tsx              # Dark footer
│   │   └── tabs/
│   │       ├── LiveCompare.tsx     # Tab 1: Side-by-side comparison
│   │       ├── Benchmark.tsx       # Tab 2: Radar + bar charts
│   │       ├── CostAnalysis.tsx    # Tab 3: Cost projections
│   │       └── GraphExplorer.tsx   # Tab 4: Interactive graph viz
│   └── src/lib/design-tokens.ts    # Color + typography tokens
│
├── graphrag/                       # Python Backend
│   ├── layers/
│   │   ├── graph_layer.py          # Layer 1: TigerGraph
│   │   ├── orchestration_layer.py  # Layer 2: Dual pipeline
│   │   ├── llm_layer.py            # Layer 3: LLM
│   │   └── evaluation_layer.py     # Layer 4: Evaluation
│   ├── dashboard.py                # Gradio dashboard
│   ├── benchmark.py                # Batch benchmark runner
│   ├── ingestion.py                # Document ingestion
│   └── main.py                     # CLI entry point
│
├── requirements.txt                # Python dependencies
└── README.md
```

---

## 📚 References

### Papers
1. [GraphRAG](https://arxiv.org/abs/2404.16130) — From Local to Global Graph RAG
2. [LightRAG](https://arxiv.org/abs/2410.05779) — Simple and Fast RAG (34K⭐)
3. [HotpotQA](https://arxiv.org/abs/1809.09600) — Multi-hop QA Dataset
4. [RAGAS](https://arxiv.org/abs/2309.15217) — RAG Evaluation Framework
5. [Youtu-GraphRAG](https://arxiv.org/abs/2508.19855) — Schema-Bounded Extraction

### Tools
[TigerGraph](https://tgcloud.io) · [Anthropic Claude](https://anthropic.com) · [Next.js](https://nextjs.org) · [Recharts](https://recharts.org) · [RAGAS](https://ragas.io) · [HotpotQA](https://huggingface.co/datasets/hotpotqa/hotpot_qa)

---

<div align="center">

**Built for the GraphRAG Inference Hackathon by TigerGraph** 🧡

*TigerGraph × Claude · Next.js 15 · Recharts · RAGAS*

</div>