
Graph RAG Service - Project Documentation

System Architecture

Overview

The Graph RAG Service is built as a modular, production-grade platform with the following key components:

  1. API Gateway (FastAPI): Handles all HTTP requests, authentication, and routing
  2. Ingestion Pipeline: Processes documents and constructs knowledge graphs
  3. Retrieval Agent (LangGraph): Intelligent query routing and response synthesis
  4. Storage Layer: Neo4j for graph + vector storage
  5. Task Queue: Celery + Redis for async processing
  6. Observability: OpenTelemetry for tracing and metrics

Design Principles

1. No Vendor Lock-in

All core components are abstracted behind interfaces:

  • GraphStore: Can swap Neo4j for AWS Neptune
  • VectorStore: Supports multiple vector databases
  • LLMProvider: Works with any LLM (OpenAI, Anthropic, Gemini, Ollama)

2. Production-Ready

  • Async Processing: Non-blocking I/O for all database operations
  • Background Jobs: Celery workers handle heavy ingestion tasks
  • Authentication: JWT-based with RBAC support
  • Error Handling: Graceful degradation and fallback mechanisms
  • Observability: Full tracing and metrics collection

3. Intelligent Retrieval

The agentic system:

  • Decomposes complex queries into sub-queries
  • Dynamically selects retrieval methods (vector vs. graph vs. Cypher)
  • Validates outputs against schema (hallucination guard)
  • Provides reasoning chains for transparency

Components Deep Dive

Core Abstractions (src/graph_rag_service/core/)

GraphStore Interface

from abc import ABC, abstractmethod
from typing import List

# Entity and Relationship are the service's domain models.
class GraphStore(ABC):
    @abstractmethod
    async def create_node(self, entity: Entity) -> str: ...

    @abstractmethod
    async def create_relationship(self, relationship: Relationship) -> str: ...

    @abstractmethod
    async def execute_query(self, query: str, params: dict) -> List[dict]: ...

    @abstractmethod
    async def find_path(self, source: str, target: str, max_depth: int) -> List[dict]: ...

Implementation: Neo4jStore provides unified graph + vector storage using Neo4j 5.x vector capabilities.
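
A minimal sketch of the query path, assuming the official neo4j async driver; everything beyond the interface method shown is illustrative, and the remaining GraphStore methods are omitted:

from typing import List
from neo4j import AsyncGraphDatabase

class Neo4jStore:
    def __init__(self, uri: str, user: str, password: str):
        # Credentials come from NEO4J_URI / NEO4J_USER / NEO4J_PASSWORD
        self._driver = AsyncGraphDatabase.driver(uri, auth=(user, password))

    async def execute_query(self, query: str, params: dict) -> List[dict]:
        async with self._driver.session() as session:
            result = await session.run(query, params)
            return [record.data() async for record in result]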

LLMProvider Interface

from abc import ABC, abstractmethod
from typing import List

class LLMProvider(ABC):
    @abstractmethod
    async def complete(self, prompt: str, **kwargs) -> str: ...

    @abstractmethod
    async def embed(self, text: str) -> List[float]: ...

Implementation: UnifiedLLMProvider wraps OpenAI, Anthropic, Gemini, and Ollama with a consistent interface.
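
For illustration, roughly what the OpenAI branch of such a wrapper could look like (model names are placeholders; the other providers follow the same shape):

from typing import List
from openai import AsyncOpenAI

class OpenAIProvider(LLMProvider):
    def __init__(self, model: str = "gpt-4o-mini"):
        self._client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
        self._model = model

    async def complete(self, prompt: str, **kwargs) -> str:
        resp = await self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
            **kwargs,
        )
        return resp.choices[0].message.content

    async def embed(self, text: str) -> List[float]:
        resp = await self._client.embeddings.create(
            model="text-embedding-3-small",  # placeholder embedding model
            input=text,
        )
        return resp.data[0].embedding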

Entity Resolution

Multi-stage resolution (stages 3-5 are sketched after this list):

  1. Blocking: Group by entity type and name similarity (fast reject)
  2. Semantic Check: Compare embeddings for deep similarity
  3. Threshold Matching: Configurable thresholds (0.85 default)
  4. Auto-merge: High confidence merges (>0.95)
  5. Human Review Queue: Medium confidence flagged for review (0.85-0.95)
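
A simplified sketch of stages 3-5, using the thresholds listed above; cosine() here is a plain-Python stand-in for however similarity is actually computed:

import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def resolve(sim: float, threshold: float = 0.85, auto_merge: float = 0.95) -> str:
    if sim > auto_merge:
        return "merge"          # stage 4: auto-merge on high confidence
    if sim >= threshold:
        return "review"         # stage 5: flag for human review (0.85-0.95)
    return "keep_separate"      # below threshold: treat as distinct entities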

Ingestion Pipeline (src/graph_rag_service/ingestion/)

Flow

  1. Document Processing: Extract text from PDF/TXT/MD/DOCX
  2. Chunking: Split into overlapping chunks (1024 tokens, 200 overlap; see the sketch after this list)
  3. Ontology Generation: LLM analyzes samples to propose entity/relationship types
  4. Entity Extraction: Extract entities and relationships per chunk
  5. Entity Resolution: Deduplicate and merge entities
  6. Embedding Generation: Create vector embeddings (BGE-M3)
  7. Graph Construction: Store in Neo4j with hybrid nodes
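
Step 2 is simple enough to sketch; the real pipeline counts model tokens, whereas this illustration treats the input as a pre-tokenized list:

def chunk(tokens: list[str], size: int = 1024, overlap: int = 200) -> list[list[str]]:
    # Slide a window of `size` tokens forward by `size - overlap` each step,
    # so consecutive chunks share `overlap` tokens.
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]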

Hybrid Nodes

Each chunk is stored as:

  • A (:Chunk) node with text and embedding
  • [:MENTIONS] relationships linking it to the (:Entity) nodes it references

This preserves source text for grounding while enabling abstract graph queries.
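
The hybrid write can be expressed in a single Cypher statement; the pattern below illustrates the shape, not the service's exact schema:

HYBRID_WRITE = """
MERGE (c:Chunk {id: $chunk_id})
SET c.text = $text, c.embedding = $embedding
WITH c
UNWIND $entity_ids AS eid
MATCH (e:Entity {id: eid})
MERGE (c)-[:MENTIONS]->(e)
"""

# e.g. await store.execute_query(HYBRID_WRITE, {"chunk_id": ..., ...})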

Retrieval System (src/graph_rag_service/retrieval/)

Tools

  1. VectorSearchTool: Semantic similarity using embeddings
  2. GraphTraversalTool: Relationship exploration and path finding
  3. CypherGenerationTool: Text-to-Cypher with validation
  4. MetadataFilterTool: Structured queries on attributes

Agent Workflow (LangGraph)

[Query] → [Decompose] → [Route] → [Vector/Graph/Cypher] → [Synthesize] → [Response]
                           ↑                                      ↓
                           └──────────────────────────────────────┘
                                    (Iterative refinement)
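
A minimal LangGraph sketch of this loop; the node functions are stubs standing in for the real LLM and tool calls, and the state fields are illustrative:

from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict, total=False):
    question: str
    route: str
    context: list
    answer: str

# Stub nodes: the real nodes call the LLM and the retrieval tools.
def decompose(s: AgentState): return {"route": "vector"}
def vector_search(s: AgentState): return {"context": ["..."]}
def graph_traverse(s: AgentState): return {"context": ["..."]}
def cypher_generate(s: AgentState): return {"context": ["..."]}
def synthesize(s: AgentState): return {"answer": "...", "route": "done"}

builder = StateGraph(AgentState)
for name, fn in [("decompose", decompose), ("vector", vector_search),
                 ("graph", graph_traverse), ("cypher", cypher_generate),
                 ("synthesize", synthesize)]:
    builder.add_node(name, fn)
builder.set_entry_point("decompose")
builder.add_conditional_edges("decompose", lambda s: s["route"],
                              {"vector": "vector", "graph": "graph", "cypher": "cypher"})
for tool in ("vector", "graph", "cypher"):
    builder.add_edge(tool, "synthesize")
# Synthesize either finishes or loops back for iterative refinement.
builder.add_conditional_edges("synthesize",
                              lambda s: END if s.get("route") == "done" else "decompose")
agent = builder.compile()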

Hallucination Guards

  • Schema Injection: Prompt includes allowed entity/relationship types
  • Cypher Validation: Parse generated queries and validate labels against a whitelist (see the sketch after this list)
  • Self-Correction: Feed errors back to LLM to fix syntax
  • Fallback: If graph fails, degrade to vector search
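
A deliberately naive sketch of the whitelist check; a real validator would parse the query properly, and the allowed set would be loaded from the ontology:

import re

ALLOWED = {"Chunk", "Entity", "MENTIONS"}  # illustrative; derived from the ontology

def cypher_violations(query: str) -> list[str]:
    # Collect every `:Label` / `:REL_TYPE` token and report any outside the whitelist.
    used = set(re.findall(r":([A-Za-z_]\w*)", query))
    return sorted(used - ALLOWED)  # empty list means the query passes

# cypher_violations("MATCH (c:Chunk)-[:WROTE]->(e:Entity)") -> ["WROTE"]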

API Layer (src/graph_rag_service/api/)

Endpoints

  • POST /api/auth/login: Get JWT token
  • POST /api/documents/upload: Upload document (returns task ID)
  • GET /api/documents/status/{task_id}: Check ingestion progress
  • POST /api/query: Execute agentic query
  • GET /api/ontology: Get current ontology schema
  • PUT /api/ontology: Update ontology (admin only)
  • GET /api/graph/visualization: Get graph data for visualization
  • GET /api/system/health: System health check
  • GET /api/system/stats: System statistics

Authentication

  • JWT tokens with configurable expiration (default: 30 min)
  • RBAC with scopes: read, write, admin
  • Dependency injection for protected endpoints
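
A sketch of how such a dependency could enforce scopes, assuming PyJWT and an OAuth2 bearer flow; claim names and secret handling are illustrative:

import jwt
from fastapi import Depends, HTTPException
from fastapi.security import OAuth2PasswordBearer

SECRET_KEY = "change-me"  # loaded from SECRET_KEY in the real service
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/api/auth/login")

def require_scope(scope: str):
    def checker(token: str = Depends(oauth2_scheme)) -> dict:
        try:
            claims = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        except jwt.PyJWTError:
            raise HTTPException(status_code=401, detail="Invalid token")
        if scope not in claims.get("scopes", []):
            raise HTTPException(status_code=403, detail=f"Missing scope: {scope}")
        return claims
    return checker

# e.g. @app.put("/api/ontology", dependencies=[Depends(require_scope("admin"))])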

Workers (src/graph_rag_service/workers/)

Celery Tasks

  • ingest_document: Process single document
  • ingest_documents_batch: Process multiple documents
  • health_check: Worker health verification

Configuration

  • Broker: Redis
  • Result Backend: Redis
  • Serializer: JSON
  • Task timeout: 1 hour (configurable)
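
Roughly how this wiring could look; connection URLs are illustrative, and the real task body runs the async ingestion pipeline:

from celery import Celery

celery_app = Celery(
    "graph_rag_service",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/0",
)
celery_app.conf.update(
    task_serializer="json",
    result_serializer="json",
    accept_content=["json"],
    task_time_limit=3600,  # 1 hour, as listed above
)

@celery_app.task(name="ingest_document")
def ingest_document(path: str) -> dict:
    return {"status": "queued", "path": path}  # stub body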

Observability (src/graph_rag_service/observability/)

OpenTelemetry Integration

  • Traces: Agent reasoning steps, tool calls, database queries
  • Metrics:
    • documents_ingested: Counter
    • queries_executed: Counter
    • query_duration_seconds: Histogram
    • entities_extracted: Counter
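
Declaring these instruments with the OpenTelemetry metrics API would look roughly like this (the meter name is a placeholder):

from opentelemetry import metrics

meter = metrics.get_meter("graph_rag_service")
documents_ingested = meter.create_counter("documents_ingested")
queries_executed = meter.create_counter("queries_executed")
query_duration = meter.create_histogram("query_duration_seconds", unit="s")
entities_extracted = meter.create_counter("entities_extracted")

# e.g. query_duration.record(elapsed, attributes={"route": "vector"})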

Structured Logging

  • Log level: INFO (configurable)
  • Format: %(asctime)s - %(name)s - %(levelname)s - %(message)s
  • All async operations logged with context
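
For reference, the equivalent stdlib setup:

import logging

logging.basicConfig(
    level=logging.INFO,  # configurable, per the note above
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger("graph_rag_service")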

Configuration

Environment Variables

Key settings in .env:

  • Neo4j: NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD
  • Redis: REDIS_HOST, REDIS_PORT
  • LLM Provider: DEFAULT_LLM_PROVIDER (openai/anthropic/gemini/ollama)
  • API Keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY
  • Ollama: OLLAMA_BASE_URL, OLLAMA_MODEL, OLLAMA_EMBEDDING_MODEL
  • Security: SECRET_KEY, ACCESS_TOKEN_EXPIRE_MINUTES

Tuning Parameters

  • CHUNK_SIZE: 1024 (text chunk size)
  • CHUNK_OVERLAP: 200 (overlap between chunks)
  • MAX_AGENT_ITERATIONS: 5 (max reasoning steps)
  • AGENT_TIMEOUT_SECONDS: 30 (query timeout)
  • ENTITY_RESOLUTION_THRESHOLD: 0.85 (similarity threshold)
  • DEFAULT_TOP_K: 5 (retrieval results)
  • GRAPH_MAX_DEPTH: 3 (graph traversal depth)
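
One common way to surface these knobs, assuming pydantic-settings; the project's actual Settings class may differ:

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    CHUNK_SIZE: int = 1024
    CHUNK_OVERLAP: int = 200
    MAX_AGENT_ITERATIONS: int = 5
    AGENT_TIMEOUT_SECONDS: int = 30
    ENTITY_RESOLUTION_THRESHOLD: float = 0.85
    DEFAULT_TOP_K: int = 5
    GRAPH_MAX_DEPTH: int = 3

settings = Settings()  # values in .env override these defaults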

Deployment

Local Development

# 1. Ensure Neo4j and Redis are running
# 2. Configure .env with connection details

# 3. Start API server
./start-server.sh  # or start-server.bat on Windows

# 4. Start workers
./start-worker.sh  # or start-worker.bat on Windows

Production Considerations

  1. Database: Use managed Neo4j (Aura) or self-hosted cluster
  2. Redis: Use managed Redis (AWS ElastiCache, Redis Cloud)
  3. Worker Scaling: Add more Celery workers based on ingestion load
  4. API Scaling: Run multiple API instances behind load balancer
  5. Monitoring: Integrate with Prometheus/Grafana for metrics
  6. Secrets: Use secret management (AWS Secrets Manager, HashiCorp Vault)

Extensibility

Adding New LLM Provider

  1. Implement LLMProvider interface
  2. Add to LLMFactory.create() method
  3. Update config with new provider settings
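
In outline, building on the LLMProvider interface shown earlier (LLMFactory's real signature may differ, and the provider name is hypothetical):

class MyProvider(LLMProvider):
    async def complete(self, prompt: str, **kwargs) -> str:
        ...  # call your backend here

    async def embed(self, text: str) -> list[float]:
        ...

class LLMFactory:
    @staticmethod
    def create(name: str) -> LLMProvider:
        if name == "myprovider":
            return MyProvider()
        raise ValueError(f"Unknown provider: {name}")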

Adding New Graph Database

  1. Implement GraphStore interface
  2. Update IngestionPipeline to use new store
  3. Test with existing workflows

Custom Retrieval Tools

  1. Create new tool class with run() method
  2. Add to AgentRetrievalSystem.tools
  3. Update routing logic in _route_query()
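
A skeleton for such a tool; the constructor argument, tool name, and Cypher are illustrative:

class MetadataDateFilterTool:
    name = "metadata_date_filter"

    def __init__(self, store):
        self.store = store  # any GraphStore implementation

    async def run(self, query: str, **kwargs) -> list[dict]:
        # Structured filter on chunk attributes rather than semantic search.
        return await self.store.execute_query(
            "MATCH (c:Chunk) WHERE c.date >= $since RETURN c LIMIT $k",
            {"since": kwargs.get("since"), "k": kwargs.get("top_k", 5)},
        )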

Testing Strategy

Unit Tests

  • Test each component independently
  • Mock external dependencies (Neo4j, Redis, LLMs)
  • Focus on business logic

Integration Tests

  • Test component interactions
  • Use test database instances
  • Verify end-to-end flows

Performance Tests

  • Benchmark ingestion throughput
  • Measure query latencies
  • Stress test with concurrent requests

Future Enhancements

Phase 1 (Current MVP)

  • ✅ Core ingestion pipeline
  • ✅ Agentic retrieval system
  • ✅ Multi-LLM support
  • ✅ Entity resolution
  • ✅ Async workers

Phase 2 (Next Steps)

  • React frontend with visual ontology editor
  • Graph visualization (D3.js/Cytoscape)
  • Advanced ontology evolution with migrations
  • Semantic cache with Redis
  • Batch ingestion optimization

Phase 3 (Advanced Features)

  • Multi-tenant support with data isolation
  • Fine-tuned entity extraction models
  • Graph neural network embeddings
  • Automated ontology quality metrics
  • Export/import ontology schemas

Troubleshooting

Common Issues

Neo4j Connection Failed

  • Verify Neo4j is running and accessible
  • Verify credentials in .env
  • Try connecting with cypher-shell: cypher-shell -u neo4j -p password

Celery Worker Not Processing

  • Check Redis is running: redis-cli ping
  • Verify broker URL in .env
  • Check worker logs

Ollama Models Not Found

  • Pull models: ollama pull llama3.2 && ollama pull bge-m3
  • Verify Ollama is running: curl http://localhost:11434/api/tags

Query Returns No Results

  • Verify documents are ingested: GET /api/system/stats
  • Check ontology exists: GET /api/ontology
  • Try simpler queries first

Support

For issues or questions:

  1. Check documentation and troubleshooting guide
  2. Search existing GitHub issues
  3. Open new issue with:
    • Clear description
    • Steps to reproduce
    • Environment details
    • Relevant logs

Last Updated: February 2026
Version: 0.1.0