
πŸ” GraphRAG Inference Hackathon β€” Dual Pipeline System


Proving that graphs make LLM inference faster, cheaper, and smarter — with any LLM provider.

Quick Start · 12 Providers · OpenClaw · Architecture · Benchmarks · Deploy


🚀 Quick Start

Option A: Next.js Dashboard (Recommended)

cd web
npm install
cp .env.example .env.local
# Set ANY provider key — or just use Ollama for free:
npm run dev
# → http://localhost:3000

Option B: Docker (One Command)

docker build -t graphrag .
docker run -p 3000:3000 -e ANTHROPIC_API_KEY=sk-ant-... graphrag

Option C: Python CLI

pip install -r requirements.txt
python -m graphrag.main demo

Option D: Ollama (100% Free, Local)

ollama pull llama3.2
cd web && npm install && npm run dev
# Select "Ollama (Local)" in provider dropdown

πŸ—οΈ Architecture (AI Factory Model β€” 4 Layers)

┌──────────────────────────────────────────────────────────────────────────┐
│  LAYER 4: EVALUATION                                                     │
│  Next.js Dashboard │ RAGAS │ F1/EM │ Cost Tracking │ Live Benchmark      │
├──────────────────────────────────────────────────────────────────────────┤
│  LAYER 3: UNIVERSAL LLM (12 Providers)                                   │
│  OpenAI │ Claude │ Gemini │ Mistral │ Ollama │ Groq │ DeepSeek │ …       │
├──────────────────────────────────────────────────────────────────────────┤
│  LAYER 2: ORCHESTRATION                                                  │
│  Pipeline A: Baseline RAG  │  Pipeline B: GraphRAG                       │
│  Query → Vector → LLM      │  Query → Keywords → Graph → Context → LLM   │
│                            │  🧠 Adaptive Router │ 🔗 Reasoning Paths    │
├──────────────────────────────────────────────────────────────────────────┤
│  LAYER 1: GRAPH (TigerGraph Cloud)                                       │
│  Schema: Document → Chunk → Entity → Community                           │
│  GSQL: vectorSearchChunks │ vectorSearchEntities │ graphRAGTraverse      │
└──────────────────────────────────────────────────────────────────────────┘

Each layer is a separate module — swap TigerGraph for Neo4j, Claude for Ollama, or RAGAS for custom evals without touching other layers.
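As a rough Python sketch of what those swappable boundaries imply (the `Protocol` names and method signatures here are illustrative assumptions, not the repo's actual interfaces):

```python
from typing import Protocol


class GraphLayer(Protocol):
    """Layer 1 boundary: any graph backend (TigerGraph, Neo4j, ...)."""
    def traverse(self, keywords: list[str], hops: int) -> list[str]: ...


class LLMLayer(Protocol):
    """Layer 3 boundary: any of the 12 providers behind one interface."""
    def complete(self, prompt: str) -> str: ...


def graphrag_answer(query: str, graph: GraphLayer, llm: LLMLayer) -> str:
    """Layer 2, Pipeline B: keywords -> graph -> context -> LLM."""
    keywords = query.lower().split()  # placeholder keyword extraction
    context = "\n".join(graph.traverse(keywords, hops=2))
    return llm.complete(f"Context:\n{context}\n\nQuestion: {query}")
```

Because the dependencies are structural (`Protocol`), any backend that implements `traverse`/`complete` drops in without changes to the pipeline code.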


🤖 Supported LLM Providers

| # | Provider | Default Model | Cost/1K tokens (in / out) | Speed |
|---|----------|---------------|---------------------------|-------|
| 1 | OpenAI | gpt-4o-mini | $0.00015 / $0.0006 | ⚡ Fast |
| 2 | Anthropic Claude | claude-sonnet-4 | $0.003 / $0.015 | 🔵 Medium |
| 3 | Google Gemini | gemini-2.0-flash | $0.0001 / $0.0004 | ⚡ Fast |
| 4 | Mistral AI | mistral-large | $0.002 / $0.006 | 🔵 Medium |
| 5 | Cohere | command-r-plus | $0.0025 / $0.01 | 🔵 Medium |
| 6 | 🦙 Ollama | llama3.2 | $0 / $0 | ⚡ Local |
| 7 | OpenRouter | llama-3.3-70b | $0.0004 / $0.0004 | 🔵 Medium |
| 8 | Groq | llama-3.3-70b | $0.0006 / $0.0008 | ⚡⚡ Blazing |
| 9 | xAI Grok | grok-3-mini | $0.0003 / $0.0005 | ⚡ Fast |
| 10 | Together AI | llama-3.1-70b | $0.0009 / $0.0009 | ⚡ Fast |
| 11 | HuggingFace | llama-3.3-70b | $0 / $0 | 🔵 Medium |
| 12 | DeepSeek | deepseek-chat | $0.00014 / $0.00028 | ⚡ Fast |
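To see how the per-1K-token prices above translate into a per-query cost, here is a minimal sketch (prices hardcoded from the table; `query_cost` is a hypothetical helper, not the dashboard's actual code):

```python
# $/1K tokens as (input, output), copied from the provider table above
PRICES: dict[str, tuple[float, float]] = {
    "openai": (0.00015, 0.0006),
    "groq": (0.0006, 0.0008),
    "ollama": (0.0, 0.0),  # local models are free
}


def query_cost(provider: str, tokens_in: int, tokens_out: int) -> float:
    """Estimated USD cost for one query: tokens / 1000 * price per 1K."""
    p_in, p_out = PRICES[provider]
    return tokens_in / 1000 * p_in + tokens_out / 1000 * p_out
```

For example, a 1000-token-in / 500-token-out query on gpt-4o-mini costs about $0.00045, while the same query on Ollama costs nothing.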

How: All providers use the OpenAI SDK with a dynamic baseURL — zero extra dependencies. Switch providers from the dropdown in the dashboard UI.


🌟 Novel Features

  1. 🧠 Adaptive Query Router — complexity scoring → auto pipeline selection
  2. 📋 Schema-Bounded Extraction — 9 entity types + 15 relation types
  3. 🔑 Dual-Level Keywords — LightRAG-inspired high/low-level retrieval
  4. 🔗 Graph Reasoning Paths — step-by-step NL traversal explanation
  5. 🤖 12-Provider Universal LLM — including free local Ollama
  6. 🦞 OpenClaw Agent Skills — GraphRAG as autonomous agent capabilities
  7. 📊 Live Benchmark Button — run real evaluations from the dashboard
  8. 💰 12-Provider Cost Comparison — real-time projections
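To make feature 1 concrete, here is one way a complexity-scoring router could be sketched. The cue words, weights, and threshold are made-up heuristics for illustration, not the repo's actual scoring:

```python
# Words that often signal multi-hop questions (an assumed heuristic)
MULTIHOP_CUES = ("who", "which", "compare", "both", "between", "and")


def complexity_score(query: str) -> float:
    """Crude complexity estimate: query length plus multi-hop cue words."""
    words = query.lower().split()
    cues = sum(w.strip("?,.") in MULTIHOP_CUES for w in words)
    return 0.05 * len(words) + 0.3 * cues


def route(query: str, threshold: float = 0.7) -> str:
    """Send complex queries to GraphRAG, simple ones to cheap baseline RAG."""
    return "graphrag" if complexity_score(query) >= threshold else "baseline"
```

A lookup like "What is the capital of France?" scores low and stays on the cheap baseline, while a bridge-style question full of conjunctions crosses the threshold and gets graph traversal.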

📊 Benchmarks

Live Benchmark (Run from Dashboard)

Click "πŸƒ Run Benchmark Now" in the Benchmark tab to evaluate both pipelines on 10 HotpotQA questions with your configured provider. Results populate real-time with F1, EM, token counts, costs.

Expected Results (HotpotQA)

| Metric | Baseline RAG | GraphRAG | Winner |
|--------|--------------|----------|--------|
| F1 Score | ~0.45–0.60 | ~0.55–0.70 | ✅ GraphRAG |
| Exact Match | ~0.30–0.45 | ~0.35–0.50 | ✅ GraphRAG |
| Tokens/Query | ~800–1000 | ~2000–2800 | ✅ Baseline |
| F1 Win Rate | — | ~55–70% | ✅ GraphRAG |

Key Finding: GraphRAG consistently outperforms baseline on multi-hop questions (bridge type) where connecting facts across documents is required. The token overhead is 2–3×, but the Adaptive Router eliminates this cost for simple queries.
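The F1 and EM columns above are the standard token-overlap metrics used for HotpotQA-style QA. A minimal reimplementation for reference (the official scorer also strips articles and punctuation, which is omitted here):

```python
from collections import Counter


def normalize(s: str) -> list[str]:
    """Lowercase and tokenize; a simplification of the official normalizer."""
    return s.lower().split()


def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))


def f1(pred: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between pred and gold."""
    p, g = normalize(pred), normalize(gold)
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)
```

So a prediction of "Barack Obama" against gold "Obama" gets EM 0 but F1 of 2/3: every gold token is covered (recall 1.0) while only half the predicted tokens match (precision 0.5).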


🦞 OpenClaw Integration

Full CIK model (Capability + Identity + Knowledge):

| File | Purpose |
|------|---------|
| openclaw/SOUL.md | Agent identity, values, personality |
| openclaw/IDENTITY.md | Configuration, supported providers |
| openclaw/MEMORY.md | Learned facts about GraphRAG |
| openclaw/skills/graph_query/ | NL → knowledge graph traversal |
| openclaw/skills/compare_pipelines/ | Dual-pipeline comparison |
| openclaw/skills/cost_estimate/ | 12-provider cost projection |

🧪 Testing

# Run all 31 unit tests
python tests/test_core.py

# Tests cover:
# - cosine_similarity (5 cases including edge cases)
# - chunk_text (4 cases: basic, empty, short, overlap)
# - entity ID generation (3 cases: deterministic, case-insensitive, type-different)
# - F1/EM computation (5 cases: perfect, partial, no overlap, empty)
# - context hit rate (2 cases)
# - token efficiency (3 cases)
# - provider registry (4 cases: completeness, fields, ollama free, available)
# - evaluation layer aggregate + report (2 cases)
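The deterministic, case-insensitive entity IDs exercised by those tests could work roughly like this; the hash scheme below is a guess at the idea, not the repo's exact function:

```python
import hashlib


def entity_id(name: str, etype: str) -> str:
    """Stable ID: same (name, type) pair always hashes to the same value,
    regardless of casing or surrounding whitespace; different types differ."""
    key = f"{etype.lower()}::{name.lower().strip()}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:16]
```

Deterministic IDs matter for graph ingestion: re-extracting "Paris" and "paris" from two documents must merge into one Entity vertex rather than creating duplicates.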

🐳 Deployment

Docker

docker build -t graphrag .
docker run -p 3000:3000 \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -e OPENAI_API_KEY=sk-... \
  graphrag

Vercel

cd web
npx vercel --prod

Env Variables

# Set any/all — system auto-detects available providers
ANTHROPIC_API_KEY=sk-ant-...   # Claude
OPENAI_API_KEY=sk-...          # GPT-4o
GEMINI_API_KEY=AIza...         # Gemini
GROQ_API_KEY=gsk_...           # Groq (ultra-fast)
DEEPSEEK_API_KEY=sk-...        # DeepSeek (cheapest)
# Or: ollama pull llama3.2     # Free, local
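The auto-detection described above can be as simple as checking which of those keys are present in the environment (env var names are from the list above; the function name is illustrative):

```python
import os

# Provider -> env var holding its API key (subset of the 12 providers)
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


def available_providers() -> list[str]:
    """Providers whose key is set; Ollama needs no key, so it is always listed."""
    found = [p for p, var in PROVIDER_KEYS.items() if os.environ.get(var)]
    return found + ["ollama"]
```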

πŸ“ Project Structure (68 files, 240KB)

├── web/                            # Next.js 15 Dashboard
│   ├── src/app/
│   │   ├── globals.css             # 14KB fused TigerGraph×Claude design system
│   │   └── api/
│   │       ├── compare/route.ts    # Multi-provider dual-pipeline API
│   │       ├── benchmark/route.ts  # Live benchmark runner with F1/EM
│   │       └── providers/route.ts  # Available providers + Ollama health
│   ├── src/components/tabs/
│   │   ├── LiveCompare.tsx         # Provider selector + side-by-side comparison
│   │   ├── Benchmark.tsx           # Live "Run Now" + radar/bar charts
│   │   ├── CostAnalysis.tsx        # 12-provider cost projections
│   │   └── GraphExplorer.tsx       # Interactive SVG knowledge graph
│   └── src/lib/
│       ├── llm-providers.ts        # 12-provider universal client (18KB)
│       └── design-tokens.ts        # Color + typography tokens
│
├── openclaw/                       # OpenClaw Agent (CIK model)
│   ├── SOUL.md / IDENTITY.md / MEMORY.md
│   └── skills/ (3 skills)
│
├── graphrag/                       # Python Backend
│   └── layers/
│       ├── graph_layer.py          # TigerGraph schema + GSQL
│       ├── orchestration_layer.py  # Dual pipeline + adaptive router
│       ├── llm_layer.py            # LLM interactions
│       ├── evaluation_layer.py     # RAGAS + F1/EM
│       └── universal_llm.py        # LiteLLM 12-provider support
│
├── tests/test_core.py              # 31 unit tests
├── Dockerfile                      # One-command deployment
└── README.md

📚 References

  1. GraphRAG — From Local to Global
  2. LightRAG — Simple and Fast (34K⭐)
  3. OpenClaw — Personal AI Agent
  4. HotpotQA — Multi-hop QA
  5. RAGAS — RAG Evaluation
  6. Youtu-GraphRAG — Schema-Bounded

TigerGraph · Anthropic · Ollama · Groq · LiteLLM · Next.js · Recharts


πŸ† Built for the GraphRAG Inference Hackathon by TigerGraph

12 LLM Providers · OpenClaw Agent · Ollama Local · TigerGraph · Next.js 15 · 31 Unit Tests · Docker