title: CivicSetu
emoji: 🏛️
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: default
app_file: app.py
pinned: false

CivicSetu

Live: https://civicsetu-two.vercel.app

Open-source RAG system for querying Indian civic and legal documents, with accurate citations, cross-reference traversal, and conflict detection between laws.

Current status: Phase 9 complete, with 5-jurisdiction RERA coverage, a RAGAS evaluation pipeline (0.90 faithfulness), hybrid RRF retrieval, and a mobile-responsive Next.js frontend live on Vercel.


What it does

Ask a plain-English question about RERA. Get a cited, structured answer with section references, a confidence score, and a legal disclaimer, all grounded in real legal text.


Query:  "Which state rules implement section 9 of RERA on agent registration?"

Answer: "Section 9 of the RERA Act 2016 governs agent registration at the central level.
Rule 11 of Maharashtra Rules 2017 and Rule 8 of Karnataka RERA Rules derive
from Section 9, specifying application procedures and timelines..."

Citations: [Section 9, RERA Act 2016], [Rule 11, Maharashtra Rules 2017],
[Rule 8, Karnataka RERA Rules]
Confidence: 0.96 (high)

Architecture


FastAPI → LangGraph Agent → pgvector + Neo4j + PostgreSQL
↑
Ingestion Pipeline (PDF → chunks → embeddings → graph)

Three stores per query:

  • pgvector: semantic similarity (fact lookups)
  • Neo4j: section graph traversal (cross-references, DERIVED_FROM edges)
  • PostgreSQL: full chunk text + metadata
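
To make the graph lookup concrete, here is a minimal in-memory sketch of following DERIVED_FROM edges from a central section to the state rules that implement it. The edge list, node labels, and helper names are hypothetical, built from the example query above; the real system runs this kind of traversal against Neo4j rather than a Python dict.

```python
from collections import defaultdict

# Hypothetical edge list: (state rule, central section it derives from).
# Labels come from the example answer above; the real graph lives in Neo4j.
DERIVED_FROM = [
    ("Rule 11, Maharashtra Rules 2017", "Section 9, RERA Act 2016"),
    ("Rule 8, Karnataka RERA Rules", "Section 9, RERA Act 2016"),
]


def build_reverse_index(edges):
    """Map each central section to the state rules that derive from it."""
    index = defaultdict(list)
    for rule, section in edges:
        index[section].append(rule)
    return index


def derived_rules(index, section):
    """Return state rules derived from `section`, sorted for stable output."""
    return sorted(index.get(section, []))


index = build_reverse_index(DERIVED_FROM)
rules = derived_rules(index, "Section 9, RERA Act 2016")
print(rules)
```

A section with no incoming DERIVED_FROM edges simply yields an empty list.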

Full design: HLD.md | LLD.md | RAG.md


Quickstart

Prerequisites

  • Docker + Docker Compose
  • uv package manager
  • One of: Gemini API key (free tier) or Groq API key (free tier)

No Ollama required. Embeddings run locally via sentence-transformers. First run downloads nomic-embed-text-v1.5 (~550MB) from HuggingFace and caches it.

Setup

# 1. Clone and install
git clone https://github.com/adeshboudh/civicsetu.git && cd civicsetu
make install

# 2. Configure secrets
cp .env.example .env
# Set GEMINI_API_KEY and/or GROQ_API_KEY; everything else has working defaults

# 3. Start infrastructure
make docker-up

# 4. Ingest all 5 jurisdictions
make ingest

# 5. Start the API
make serve

Full docs: HLD | LLD

Production

  • Frontend: Vercel (Next.js 15 App Router, mobile responsive)
  • API: Hugging Face Spaces (FastAPI + Docker, 550MB model baked in)
  • PostgreSQL + pgvector: Neon (1203 chunks)
  • Neo4j: AuraDB Free (2090 sections, 2321 edges)
  • LLM: LiteLLM (Gemini → Groq → OpenRouter)
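
The arrow notation in the LLM entry suggests an ordered fallback: try Gemini, then Groq, then OpenRouter. The sketch below shows that pattern in plain Python. It is not LiteLLM's API and not necessarily how CivicSetu configures it; the provider functions and `complete_with_fallback` are hypothetical stand-ins.

```python
from typing import Callable, Sequence


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (name, answer) from the first that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def gemini(prompt: str) -> str:
    raise TimeoutError("rate limited")


def groq(prompt: str) -> str:
    return f"answer to: {prompt}"


def openrouter(prompt: str) -> str:
    return f"backup answer to: {prompt}"


used, answer = complete_with_fallback(
    "What is RERA?",
    [("gemini", gemini), ("groq", groq), ("openrouter", openrouter)],
)
print(used, answer)
```

Here the simulated Gemini call fails, so the second provider answers and the third is never called.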

6. Query

curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the penalties for a promoter who delays possession?"}'

The first request will be slow (~30-45s) while the embedding model loads into memory. Subsequent requests take 5-15s.
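
The same request can be sent from Python using only the standard library. This is a sketch: the URL and payload mirror the curl example, while the helper names and the 120-second timeout are assumptions chosen to leave room for the slow first request. The response fields are not documented here, so `ask` returns the decoded JSON as-is.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/v1/query"  # endpoint from the curl example


def build_query_request(question: str, url: str = API_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, timeout: float = 120.0) -> dict:
    """Send the question to a running API and decode the JSON response.

    The timeout value is an assumption, not a documented setting.
    """
    with urllib.request.urlopen(build_query_request(question), timeout=timeout) as resp:
        return json.load(resp)


req = build_query_request("What are the penalties for a promoter who delays possession?")
print(req.get_method(), req.full_url)
```

With the API running locally (`make serve`), calling `ask("...")` sends the request.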

Other useful commands

make e2e        # Run 12-case E2E benchmark across all 5 jurisdictions
make test       # Run unit tests
make lint       # Ruff linter
make typecheck  # mypy

# RAGAS evaluation
make eval-smoke-p1   # Phase 1: invoke graph for 5-row smoke dataset
make eval-smoke-p2   # Phase 2: score cached results with RAGAS
make eval-p1         # Phase 1: full 31-row golden dataset
make eval-p2         # Phase 2: score all 31 rows
make eval-reset      # Clear eval caches (re-runs everything)

make ingest --jurisdiction MAHARASHTRA  # Re-ingest a single jurisdiction
make docker-down                        # Tear down containers

Documents ingested

| Document | Jurisdiction | Sections |
|---|---|---|
| RERA Act 2016 | Central | 224 |
| Maharashtra Real Estate Rules 2017 | Maharashtra | 214 |
| UP RERA Rules 2016 | Uttar Pradesh | 170 |
| UP RERA General Regulations 2019 | Uttar Pradesh | 85 |
| Karnataka RERA Rules 2017 | Karnataka | 235 |
| Tamil Nadu RERA Rules 2017 | Tamil Nadu | 157 |

Total chunks: 1203. Graph: 2090 Section nodes, 1297 HAS_SECTION edges, 933 REFERENCES edges, 91 DERIVED_FROM edges.


Tech stack

| Layer | Technology |
|---|---|
| API | FastAPI + Uvicorn |
| Orchestration | LangGraph StateGraph |
| LLM routing | LiteLLM (Gemini → Groq → OpenRouter) |
| Embeddings | nomic-embed-text-v1.5 via sentence-transformers (local, no Ollama required) |
| Vector DB | pgvector + HNSW index |
| Graph DB | Neo4j Community |
| Relational | PostgreSQL + SQLAlchemy |
| Retrieval | Hybrid RRF: pgvector cosine + PostgreSQL FTS (websearch_to_tsquery OR-mode) |
| Reranker | FlashRank (rank-T5-flan) + score gap filter |
| Evaluation | RAGAS (faithfulness, answer relevancy, context precision) |
| PDF parsing | PyMuPDF |
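
The hybrid retrieval step merges the vector and full-text result lists with reciprocal rank fusion (RRF). The sketch below shows the standard RRF formula, where each document scores the sum of 1 / (k + rank) across the lists it appears in. The chunk IDs are hypothetical, and k = 60 is the commonly used default, not a value confirmed for this project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs: score(d) = sum of 1 / (k + rank), rank starting at 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical chunk IDs returned by the two retrievers.
vector_hits = ["sec-18", "sec-9", "sec-61"]
fts_hits = ["sec-18", "sec-61", "sec-40"]

fused = reciprocal_rank_fusion([vector_hits, fts_hits])
print(fused)
```

Chunks that appear in both lists rise to the top, because RRF rewards agreement between retrievers without needing their raw scores to be comparable.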

Phase roadmap

| Phase | Scope | Status |
|---|---|---|
| 0 | RERA Act 2016, vector RAG, FastAPI | ✅ Complete |
| 1 | Neo4j graph, cross-reference queries | ✅ Complete |
| 2 | MahaRERA Rules 2017, multi-jurisdiction | ✅ Complete |
| 3 | DERIVED_FROM edges, cross-jurisdiction graph | ✅ Complete |
| 4 | Multi-state expansion (UP, TN, Karnataka) | ✅ Complete |
| 5 | Agent pipeline hardening, E2E test suite | ✅ Complete |
| 6 | Next.js frontend, Vercel deployment, public URL | ✅ Complete |
| 7 | Graph explorer, section content drawer, D3 visualization | ✅ Complete |
| 8 | RAGAS eval pipeline, hybrid RRF retrieval, retrieval quality fixes | ✅ Complete |
| 9 | Mobile responsiveness, frontend polish, dual-pane layout, interaction animations | ✅ Complete |

ADRs

Disclaimer

CivicSetu provides AI-generated legal information, not legal advice. Always verify with a qualified lawyer or the official gazette.