Apply for community grant: Academic project (GPU and storage)

#1
by guru1805 - opened

Project Overview

Kanoon is an open-source legal search engine that aims to democratize access to Indian Supreme Court judgments using sentence embeddings and vector search. Unlike commercial platforms, Kanoon is completely free and supports natural language queries, semantic case retrieval, and interactive case analysis for lawyers, students, and citizens.


Technical Details

Core Technologies

  • AI/ML Stack:
    • Embeddings: all-MiniLM-L6-v2 (Hugging Face Transformers) + Google Gemini
    • Vector DB: Hybrid Qdrant (cloud) + FAISS (local)
    • RAG: LangChain + Gemini-2.0-flash for case-specific Q&A
  • Backend: FastAPI, Qdrant, Firebase Auth
  • Frontend: React, TailwindCSS, PDF.js
  • Deployment: Hugging Face Spaces (Dockerized)
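
As a rough illustration of the retrieval step behind this stack, the sketch below ranks documents by cosine similarity between a query vector and document vectors. It is not Kanoon's actual code: the 3-dimensional vectors and the names `cosine` and `top_k` are made up for illustration, whereas the real system would use all-MiniLM-L6-v2 embeddings (384 dimensions) and an index such as FAISS or Qdrant rather than a brute-force scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=5):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy vectors standing in for real embeddings (hypothetical data).
docs = {
    "case_a": [0.9, 0.1, 0.0],
    "case_b": [0.0, 1.0, 0.0],
    "case_c": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]
ranking = top_k(query, docs, k=2)
```

Here `ranking` lists the two cases whose vectors point in nearly the same direction as the query, which is the property semantic search relies on.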

Key Features

  1. Semantic Search: About 28% higher Precision@5 in relative terms than keyword-based systems (82% vs 64%).
  2. Case Chat: Ask contextual questions about judgments (e.g., "Explain the court’s reasoning on Article 21").
  3. Open-Source Pipeline: Full code for scraping → preprocessing → embedding → search.
  4. Accessibility: Dark/light themes, mobile-responsive UI.
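
To give a feel for how Case Chat could ground its answers, here is a minimal sketch of assembling a prompt from retrieved judgment passages before sending it to the language model. The function name `build_prompt`, the `max_chars` budget, and the placeholder excerpts are assumptions for illustration only; the actual project does this through LangChain and Gemini.

```python
def build_prompt(question, passages, max_chars=2000):
    """Assemble a grounded prompt from retrieved judgment passages."""
    context_parts = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        entry = f"[{i}] {passage.strip()}"
        if used + len(entry) > max_chars:
            break  # stop once the context budget is exhausted
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer using only the excerpts below and cite them by number.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )

# Placeholder excerpts (hypothetical); real ones would come from the vector search.
prompt = build_prompt(
    "Explain the court's reasoning on Article 21",
    ["Placeholder excerpt one.", "Placeholder excerpt two."],
)
```

Numbering the excerpts lets the model cite which passage supports each claim, so a user can check the answer against the judgment text.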

Innovation & Impact

Why Kanoon Stands Out

  • Solves Critical Gaps: Addresses India’s legal research barriers (cost, complexity, keyword reliance).
  • Hybrid Architecture: Balances scalability (Qdrant) and speed (FAISS) for low-latency responses.
  • Transparency: All models/data pipelines openly documented and reproducible.
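
One possible way to combine a fast local index with a larger cloud store is sketched below: query the local index first and fall back to the remote store only when the local hits look weak. This is an assumed strategy, not a description of how Kanoon actually routes queries; `hybrid_search`, `min_score`, and the two stand-in backends are hypothetical, and real backends would wrap FAISS and Qdrant clients.

```python
from typing import Callable, List, Tuple

Hit = Tuple[str, float]  # (document id, similarity score)

def hybrid_search(
    query: str,
    local_search: Callable[[str], List[Hit]],
    remote_search: Callable[[str], List[Hit]],
    min_score: float = 0.5,
    k: int = 5,
) -> List[Hit]:
    """Try the local index first; add remote hits if the local ones are weak."""
    hits = local_search(query)
    if not hits or max(score for _, score in hits) < min_score:
        hits = hits + remote_search(query)
    # Keep the best score per document, then return the top k.
    best = {}
    for doc_id, score in hits:
        if score > best.get(doc_id, float("-inf")):
            best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Stand-in backends (hypothetical) returning fixed scores.
def fake_local(query):
    return [("doc_1", 0.3)]

def fake_remote(query):
    return [("doc_2", 0.8), ("doc_1", 0.4)]

results = hybrid_search("right to privacy", fake_local, fake_remote)
```

The threshold trades latency for recall: a higher `min_score` sends more queries to the remote store, while a lower one keeps more of them local.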

Metrics (Beta Launch)

  • Precision@5: 82% (vs 64% for traditional keyword-based systems)
  • User Feedback: 4.4/5 satisfaction from legal professionals
  • Data Processed: 924 Supreme Court judgments (2024)
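
For readers unfamiliar with the metric, the sketch below shows the standard definition of Precision@k and how the two reported figures relate to each other. The document ids and relevance labels are invented for illustration and are not from Kanoon's evaluation set.

```python
def precision_at_k(retrieved, relevant, k=5):
    """Fraction of the top-k retrieved ids that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = retrieved[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant)
    return hits / k

# Hypothetical query: 4 of the top 5 results are relevant.
score = precision_at_k(["a", "b", "c", "d", "e"], {"a", "b", "c", "e"}, k=5)

# Relation between the two reported Precision@5 figures.
absolute_gain = 0.82 - 0.64      # about 0.18, i.e. 18 percentage points
relative_gain = 0.82 / 0.64 - 1  # about 0.28, i.e. 28% relative improvement
```

The reported system-level figure would be this per-query value averaged over all evaluation queries.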

Hugging Face Ecosystem Alignment

  • Models Used: sentence-transformers/all-MiniLM-L6-v2, Google Gemini via langchain-google-genai.
  • Spaces Integration:
    • Fixed caching/permissions for model loading in Spaces.
    • Optimized Dockerfile for GPU-less deployment.
  • Community Value:
    • To our knowledge, the first open-source legal RAG system for Indian case law.
    • Template for multilingual legal search in developing nations.

Long-Term Vision

  • Become the default legal search tool for Indian law students/NGOs.
  • Expand to High Courts and regional languages (Hindi, Tamil, Bengali).
  • Partner with legal aid organizations to bridge the justice gap.

Why Support Kanoon?

  • Open Source: Code, data, and models are publicly accessible.
  • Scalable Impact: Affordable legal research tools are out of reach for much of India's 1.3B+ population.
  • HF Community Growth: Demonstrates real-world NLP for social good.
