🛡️ v4.1: Fix all critical bugs, security issues, and performance problems
#2
by gaurv007 - opened
ClauseGuard v4.1: Comprehensive Bug Fix & Performance PR
🔴 CRITICAL Bug Fixes
- XSS sanitization corrupting contract text: Removed `text.replace(/</, "<")` from the analyze route, which permanently mutated contracts before analysis
- Unbounded memory leaks: `_chunk_cache` and `_prediction_cache` replaced with `BoundedCache` (LRU, max 500/2000 entries)
- Missing `middleware.ts`: Added the file; previously the auth guard never executed, so anyone could access the dashboard without logging in
- NLI input format wrong: Changed from a `[SEP]`-concatenated string to the proper `(text_a, text_b)` dict format for the cross-encoder
- Scan count race condition: Fixed table name mismatch (`analysis_history` vs `analyses`); the count now uses `profiles.analyses_this_month`
- RAG sessions never expired: Added TTL-based expiry (1 hour) for RAG sessions
🟠 HIGH-Severity Fixes
- Hardcoded admin email removed: `ankygaur9972@gmail.com` removed from public `schema.sql`
- Rate limiter improved: Sliding window with proper X-Forwarded-For IP extraction
- Input size validation: Added 200KB max text limit to prevent DoS
- Duplicate model loading eliminated: SentenceTransformer singleton in `compare.py`
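A sliding-window limiter keyed on the forwarded client IP might look like the sketch below. The names `client_ip` and `SlidingWindowLimiter`, the lowercase header key, and the limit values are all hypothetical; the code in `api/main.py` may differ. Trusting `X-Forwarded-For` is only safe when the service sits behind a proxy that sets it.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Mapping, Optional


def client_ip(headers: Mapping[str, str], peer_ip: str) -> str:
    """Return the first X-Forwarded-For hop, falling back to the socket peer."""
    forwarded = headers.get("x-forwarded-for", "")
    first_hop = forwarded.split(",")[0].strip()
    return first_hop or peer_ip


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Discard timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Unlike a fixed window that resets on a clock boundary, this keeps per-request timestamps, so a burst straddling two windows cannot exceed the limit.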
🟡 MEDIUM-Severity Fixes
- Train/inference alignment: Changed from sigmoid (multi-label) to softmax (matching single-label training)
- Classifier max_length raised: 256→512 tokens (was truncating legal clauses)
- Risk score formula fixed: Now uses absolute risk (diminishing returns), not normalized by document length
- Compliance negation detection improved: Wider window (200 chars), sentence-boundary aware
- Regex fallback coverage expanded: Added 20+ missing CUAD categories (Audit Rights, Insurance, Source Code Escrow, etc.)
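To illustrate why the sigmoid/softmax change matters, the sketch below applies both to the same logits. The logit values are made up for illustration. Sigmoid scores each class independently, so several classes can exceed 0.5 at once, while softmax yields one distribution over classes, which is what a model trained with a single-label objective expects.

```python
import math
from typing import List


def sigmoid_scores(logits: List[float]) -> List[float]:
    """Independent per-class probabilities (multi-label reading)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax_scores(logits: List[float]) -> List[float]:
    """One probability distribution over classes (single-label reading)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three clause categories.
logits = [2.0, 1.5, -1.0]
multi = sigmoid_scores(logits)
single = softmax_scores(logits)

# With softmax the prediction is simply the highest-probability class.
predicted = max(range(len(single)), key=single.__getitem__)
```

Here two of the sigmoid scores exceed 0.5, so a thresholded multi-label readout would report two categories for one clause, whereas the softmax readout commits to a single one.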
⚡ Performance Fixes
- O(n²) comparison → O(n+m): Pre-compute all embeddings once, use matrix multiplication
- Sequential NER → batched: Single pipeline call with batch_size=8 instead of per-chunk calls
- Gradio SSE polling improved: Exponential backoff (500ms→2s) instead of fixed 1s, increased timeout to 90s
- Loading skeleton added: `loading.tsx` for instant navigation feedback
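The comparison speedup can be sketched as below, assuming the O(n+m) figure refers to the number of texts passed through the encoder; the pairwise dot products are still n×m but are cheap next to model inference. `toy_embed` is a hypothetical stand-in for the real sentence encoder, and the pure-Python product stands in for what would be a single matrix multiplication in NumPy or PyTorch.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def similarity_matrix(
    clauses_a: Sequence[str],
    clauses_b: Sequence[str],
    embed: Callable[[Sequence[str]], List[Vector]],
) -> List[List[float]]:
    """Embed each side once, then take all pairwise cosine similarities."""
    emb_a = [normalize(v) for v in embed(clauses_a)]  # n texts, one call
    emb_b = [normalize(v) for v in embed(clauses_b)]  # m texts, one call
    # Equivalent to A @ B.T on unit-normalized rows.
    return [[sum(x * y for x, y in zip(a, b)) for b in emb_b] for a in emb_a]


# Hypothetical encoder: counts letters, digits, and spaces per text.
calls = {"texts_embedded": 0}


def toy_embed(texts: Sequence[str]) -> List[Vector]:
    calls["texts_embedded"] += len(texts)
    return [
        [
            float(sum(c.isalpha() for c in t)),
            float(sum(c.isdigit() for c in t)),
            float(t.count(" ")),
        ]
        for t in texts
    ]


sims = similarity_matrix(
    ["pay in 30 days", "no audit"],
    ["pay in 45 days", "audit ok", "x"],
    toy_embed,
)
```

With two clauses on one side and three on the other, the encoder sees five texts in total, rather than being invoked again for each of the six pairs.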
Files Changed
- `app.py`: Core analysis engine (all ML fixes)
- `compare.py`: O(n²)→O(n+m) comparison
- `compliance.py`: Better negation detection
- `api/main.py`: Rate limiter, RAG session TTL, input validation
- `web/middleware.ts` (new): Auth guard (was missing entirely)
- `web/app/api/analyze/route.ts`: XSS fix, scan count fix, input validation
- `web/app/api/chat/route.ts`: Proper session handling documentation
- `web/app/dashboard-pages/analyze/loading.tsx` (new): Loading skeleton
- `web/lib/supabase/schema.sql`: Removed hardcoded admin email
gaurv007 changed pull request status to open
gaurv007 changed pull request status to merged