Step 11: precompute script and fallback JSON files
Browse filesImplements specs/12_precompute.md. precompute_results.py runs full pipeline
when API key is available. Four JSON files committed as fallback: criteria.json
(5 criteria), eval_bidder_a.json (all eligible), eval_bidder_b.json (C1
not_eligible), eval_bidder_c.json (C1 needs_review from low-quality scan).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- data/precomputed/criteria.json +117 -0
- data/precomputed/eval_bidder_a.json +107 -0
- data/precomputed/eval_bidder_b.json +107 -0
- data/precomputed/eval_bidder_c.json +107 -0
- scripts/precompute_results.py +71 -0
- specs/12_precompute.md +73 -0
data/precomputed/criteria.json
ADDED
|
@@ -0,0 +1,117 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"criteria": [
|
| 3 |
+
{
|
| 4 |
+
"id": "C1",
|
| 5 |
+
"title": "Minimum Annual Turnover",
|
| 6 |
+
"category": "financial",
|
| 7 |
+
"mandatory": true,
|
| 8 |
+
"description": "The bidder shall have a minimum average annual turnover of INR 5 Crore during the last three financial years (2022-23, 2023-24, 2024-25).",
|
| 9 |
+
"rule": {
|
| 10 |
+
"type": "numeric_threshold",
|
| 11 |
+
"field": "annual_turnover_inr",
|
| 12 |
+
"operator": ">=",
|
| 13 |
+
"value": 50000000,
|
| 14 |
+
"unit": "INR"
|
| 15 |
+
},
|
| 16 |
+
"query_hints": [
|
| 17 |
+
"annual turnover",
|
| 18 |
+
"total revenue",
|
| 19 |
+
"INR crore",
|
| 20 |
+
"audited financials",
|
| 21 |
+
"CA certificate"
|
| 22 |
+
],
|
| 23 |
+
"source_page": 2,
|
| 24 |
+
"source_clause": "3.2(a)"
|
| 25 |
+
},
|
| 26 |
+
{
|
| 27 |
+
"id": "C2",
|
| 28 |
+
"title": "Completed Construction Projects",
|
| 29 |
+
"category": "technical",
|
| 30 |
+
"mandatory": true,
|
| 31 |
+
"description": "The bidder must have successfully completed at least three (3) similar construction projects of value not less than INR 1 Crore each in the last five financial years.",
|
| 32 |
+
"rule": {
|
| 33 |
+
"type": "count_threshold",
|
| 34 |
+
"field": "completed_projects",
|
| 35 |
+
"operator": ">=",
|
| 36 |
+
"value": 3,
|
| 37 |
+
"unit": null
|
| 38 |
+
},
|
| 39 |
+
"query_hints": [
|
| 40 |
+
"completed projects",
|
| 41 |
+
"construction experience",
|
| 42 |
+
"work order",
|
| 43 |
+
"completion certificate",
|
| 44 |
+
"similar projects"
|
| 45 |
+
],
|
| 46 |
+
"source_page": 2,
|
| 47 |
+
"source_clause": "3.2(b)"
|
| 48 |
+
},
|
| 49 |
+
{
|
| 50 |
+
"id": "C3",
|
| 51 |
+
"title": "GST Registration",
|
| 52 |
+
"category": "compliance",
|
| 53 |
+
"mandatory": true,
|
| 54 |
+
"description": "The bidder shall possess a valid Goods and Services Tax (GST) registration certificate. The GSTIN must be active as on the date of submission.",
|
| 55 |
+
"rule": {
|
| 56 |
+
"type": "certification_present",
|
| 57 |
+
"field": "gstin",
|
| 58 |
+
"operator": "exists",
|
| 59 |
+
"value": null,
|
| 60 |
+
"unit": null
|
| 61 |
+
},
|
| 62 |
+
"query_hints": [
|
| 63 |
+
"GSTIN",
|
| 64 |
+
"GST certificate",
|
| 65 |
+
"GST registration",
|
| 66 |
+
"tax registration"
|
| 67 |
+
],
|
| 68 |
+
"source_page": 2,
|
| 69 |
+
"source_clause": "3.2(c)"
|
| 70 |
+
},
|
| 71 |
+
{
|
| 72 |
+
"id": "C4",
|
| 73 |
+
"title": "ISO 9001:2015 Certification",
|
| 74 |
+
"category": "compliance",
|
| 75 |
+
"mandatory": true,
|
| 76 |
+
"description": "The bidder shall hold a valid ISO 9001:2015 Quality Management System certification issued by an accredited certification body.",
|
| 77 |
+
"rule": {
|
| 78 |
+
"type": "certification_present",
|
| 79 |
+
"field": "iso_9001",
|
| 80 |
+
"operator": "exists",
|
| 81 |
+
"value": null,
|
| 82 |
+
"unit": null
|
| 83 |
+
},
|
| 84 |
+
"query_hints": [
|
| 85 |
+
"ISO 9001",
|
| 86 |
+
"quality management",
|
| 87 |
+
"ISO certificate",
|
| 88 |
+
"QMS certification"
|
| 89 |
+
],
|
| 90 |
+
"source_page": 2,
|
| 91 |
+
"source_clause": "3.2(d)"
|
| 92 |
+
},
|
| 93 |
+
{
|
| 94 |
+
"id": "C5",
|
| 95 |
+
"title": "Paramilitary Infrastructure Experience",
|
| 96 |
+
"category": "technical",
|
| 97 |
+
"mandatory": false,
|
| 98 |
+
"description": "Preferably, the bidder may have prior experience with construction or maintenance of paramilitary or defence infrastructure.",
|
| 99 |
+
"rule": {
|
| 100 |
+
"type": "document_present",
|
| 101 |
+
"field": "paramilitary_experience",
|
| 102 |
+
"operator": "exists",
|
| 103 |
+
"value": null,
|
| 104 |
+
"unit": null
|
| 105 |
+
},
|
| 106 |
+
"query_hints": [
|
| 107 |
+
"paramilitary",
|
| 108 |
+
"defence infrastructure",
|
| 109 |
+
"CRPF",
|
| 110 |
+
"BSF",
|
| 111 |
+
"security forces"
|
| 112 |
+
],
|
| 113 |
+
"source_page": 2,
|
| 114 |
+
"source_clause": "3.2(e)"
|
| 115 |
+
}
|
| 116 |
+
]
|
| 117 |
+
}
|
data/precomputed/eval_bidder_a.json
ADDED
|
@@ -0,0 +1,107 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
[
|
| 2 |
+
{
|
| 3 |
+
"verdict_id": "V-9e49234c",
|
| 4 |
+
"bidder_id": "bidder_a",
|
| 5 |
+
"criterion_id": "C1",
|
| 6 |
+
"verdict": "eligible",
|
| 7 |
+
"extracted_value": "INR 6.2 Cr (avg: 6.37 Cr)",
|
| 8 |
+
"normalized_value": 63666667,
|
| 9 |
+
"source": {
|
| 10 |
+
"doc_name": "audited_financials.pdf",
|
| 11 |
+
"page": 1,
|
| 12 |
+
"snippet": "annual turnover for FY 2023-24 was INR 6,20,00,000",
|
| 13 |
+
"source_type": "text_pdf"
|
| 14 |
+
},
|
| 15 |
+
"llm_confidence": 0.93,
|
| 16 |
+
"ocr_confidence": null,
|
| 17 |
+
"combined_confidence": 0.93,
|
| 18 |
+
"reason": "Average annual turnover of INR 6.37 Cr exceeds the required threshold of INR 5 Cr.",
|
| 19 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 20 |
+
"timestamp": "2026-05-06T19:17:36.918702+00:00",
|
| 21 |
+
"review_status": "pending"
|
| 22 |
+
},
|
| 23 |
+
{
|
| 24 |
+
"verdict_id": "V-9f84114a",
|
| 25 |
+
"bidder_id": "bidder_a",
|
| 26 |
+
"criterion_id": "C2",
|
| 27 |
+
"verdict": "eligible",
|
| 28 |
+
"extracted_value": "5 projects completed",
|
| 29 |
+
"normalized_value": 5,
|
| 30 |
+
"source": {
|
| 31 |
+
"doc_name": "project_experience.pdf",
|
| 32 |
+
"page": 1,
|
| 33 |
+
"snippet": "5 completed construction projects listed, each >= INR 1 Crore",
|
| 34 |
+
"source_type": "text_pdf"
|
| 35 |
+
},
|
| 36 |
+
"llm_confidence": 0.95,
|
| 37 |
+
"ocr_confidence": null,
|
| 38 |
+
"combined_confidence": 0.95,
|
| 39 |
+
"reason": "Bidder has completed 5 similar construction projects in the last 5 years, exceeding the minimum of 3.",
|
| 40 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 41 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 42 |
+
"review_status": "pending"
|
| 43 |
+
},
|
| 44 |
+
{
|
| 45 |
+
"verdict_id": "V-46a017a4",
|
| 46 |
+
"bidder_id": "bidder_a",
|
| 47 |
+
"criterion_id": "C3",
|
| 48 |
+
"verdict": "eligible",
|
| 49 |
+
"extracted_value": "GSTIN: 27AABCA1234F1Z5",
|
| 50 |
+
"normalized_value": null,
|
| 51 |
+
"source": {
|
| 52 |
+
"doc_name": "gst_certificate.pdf",
|
| 53 |
+
"page": 1,
|
| 54 |
+
"snippet": "GSTIN: 27AABCA1234F1Z5, Status: ACTIVE",
|
| 55 |
+
"source_type": "text_pdf"
|
| 56 |
+
},
|
| 57 |
+
"llm_confidence": 0.98,
|
| 58 |
+
"ocr_confidence": null,
|
| 59 |
+
"combined_confidence": 0.98,
|
| 60 |
+
"reason": "Valid GST registration certificate present with active GSTIN status.",
|
| 61 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 62 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 63 |
+
"review_status": "pending"
|
| 64 |
+
},
|
| 65 |
+
{
|
| 66 |
+
"verdict_id": "V-9eb49fce",
|
| 67 |
+
"bidder_id": "bidder_a",
|
| 68 |
+
"criterion_id": "C4",
|
| 69 |
+
"verdict": "eligible",
|
| 70 |
+
"extracted_value": "ISO-2021-9001-APEX, valid 15-06-2027",
|
| 71 |
+
"normalized_value": null,
|
| 72 |
+
"source": {
|
| 73 |
+
"doc_name": "iso_9001.pdf",
|
| 74 |
+
"page": 1,
|
| 75 |
+
"snippet": "ISO 9001:2015 Certificate No: ISO-2021-9001-APEX, Valid Through: 15-06-2027",
|
| 76 |
+
"source_type": "text_pdf"
|
| 77 |
+
},
|
| 78 |
+
"llm_confidence": 0.97,
|
| 79 |
+
"ocr_confidence": null,
|
| 80 |
+
"combined_confidence": 0.97,
|
| 81 |
+
"reason": "Valid ISO 9001:2015 certificate present, valid through June 2027.",
|
| 82 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 83 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 84 |
+
"review_status": "pending"
|
| 85 |
+
},
|
| 86 |
+
{
|
| 87 |
+
"verdict_id": "V-8f79d80a",
|
| 88 |
+
"bidder_id": "bidder_a",
|
| 89 |
+
"criterion_id": "C5",
|
| 90 |
+
"verdict": "eligible",
|
| 91 |
+
"extracted_value": "CRPF Camp Pune barracks project (2024)",
|
| 92 |
+
"normalized_value": null,
|
| 93 |
+
"source": {
|
| 94 |
+
"doc_name": "project_experience.pdf",
|
| 95 |
+
"page": 1,
|
| 96 |
+
"snippet": "Barracks Construction, CRPF Camp Pune, INR 3.5 Cr, 2024",
|
| 97 |
+
"source_type": "text_pdf"
|
| 98 |
+
},
|
| 99 |
+
"llm_confidence": 0.88,
|
| 100 |
+
"ocr_confidence": null,
|
| 101 |
+
"combined_confidence": 0.88,
|
| 102 |
+
"reason": "Bidder has prior experience with CRPF paramilitary infrastructure (barracks construction, 2024).",
|
| 103 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 104 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 105 |
+
"review_status": "pending"
|
| 106 |
+
}
|
| 107 |
+
]
|
data/precomputed/eval_bidder_b.json
ADDED
|
@@ -0,0 +1,107 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
[
|
| 2 |
+
{
|
| 3 |
+
"verdict_id": "V-6974e6c0",
|
| 4 |
+
"bidder_id": "bidder_b",
|
| 5 |
+
"criterion_id": "C1",
|
| 6 |
+
"verdict": "not_eligible",
|
| 7 |
+
"extracted_value": "INR 1.5 Cr (avg)",
|
| 8 |
+
"normalized_value": 15000000,
|
| 9 |
+
"source": {
|
| 10 |
+
"doc_name": "audited_financials.pdf",
|
| 11 |
+
"page": 1,
|
| 12 |
+
"snippet": "average annual turnover INR 1,50,00,000",
|
| 13 |
+
"source_type": "text_pdf"
|
| 14 |
+
},
|
| 15 |
+
"llm_confidence": 0.95,
|
| 16 |
+
"ocr_confidence": null,
|
| 17 |
+
"combined_confidence": 0.95,
|
| 18 |
+
"reason": "Average annual turnover of INR 1.5 Cr is below the required minimum of INR 5 Cr.",
|
| 19 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 20 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 21 |
+
"review_status": "pending"
|
| 22 |
+
},
|
| 23 |
+
{
|
| 24 |
+
"verdict_id": "V-b0f1c880",
|
| 25 |
+
"bidder_id": "bidder_b",
|
| 26 |
+
"criterion_id": "C2",
|
| 27 |
+
"verdict": "eligible",
|
| 28 |
+
"extracted_value": "4 projects completed",
|
| 29 |
+
"normalized_value": 4,
|
| 30 |
+
"source": {
|
| 31 |
+
"doc_name": "project_experience.pdf",
|
| 32 |
+
"page": 1,
|
| 33 |
+
"snippet": "4 completed construction projects listed",
|
| 34 |
+
"source_type": "text_pdf"
|
| 35 |
+
},
|
| 36 |
+
"llm_confidence": 0.9,
|
| 37 |
+
"ocr_confidence": null,
|
| 38 |
+
"combined_confidence": 0.9,
|
| 39 |
+
"reason": "Bidder has completed 4 similar projects, meeting the minimum requirement of 3.",
|
| 40 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 41 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 42 |
+
"review_status": "pending"
|
| 43 |
+
},
|
| 44 |
+
{
|
| 45 |
+
"verdict_id": "V-078f742e",
|
| 46 |
+
"bidder_id": "bidder_b",
|
| 47 |
+
"criterion_id": "C3",
|
| 48 |
+
"verdict": "eligible",
|
| 49 |
+
"extracted_value": "GSTIN: 29AABCB5678G1Z3",
|
| 50 |
+
"normalized_value": null,
|
| 51 |
+
"source": {
|
| 52 |
+
"doc_name": "gst_certificate.pdf",
|
| 53 |
+
"page": 1,
|
| 54 |
+
"snippet": "GSTIN: 29AABCB5678G1Z3, Status: ACTIVE",
|
| 55 |
+
"source_type": "text_pdf"
|
| 56 |
+
},
|
| 57 |
+
"llm_confidence": 0.97,
|
| 58 |
+
"ocr_confidence": null,
|
| 59 |
+
"combined_confidence": 0.97,
|
| 60 |
+
"reason": "Valid GST registration certificate with active status.",
|
| 61 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 62 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 63 |
+
"review_status": "pending"
|
| 64 |
+
},
|
| 65 |
+
{
|
| 66 |
+
"verdict_id": "V-27021b0e",
|
| 67 |
+
"bidder_id": "bidder_b",
|
| 68 |
+
"criterion_id": "C4",
|
| 69 |
+
"verdict": "eligible",
|
| 70 |
+
"extracted_value": "ISO-2022-9001-BR, valid 20-08-2027",
|
| 71 |
+
"normalized_value": null,
|
| 72 |
+
"source": {
|
| 73 |
+
"doc_name": "iso_9001.pdf",
|
| 74 |
+
"page": 1,
|
| 75 |
+
"snippet": "ISO 9001:2015 Certificate No: ISO-2022-9001-BR, Valid Through: 20-08-2027",
|
| 76 |
+
"source_type": "text_pdf"
|
| 77 |
+
},
|
| 78 |
+
"llm_confidence": 0.96,
|
| 79 |
+
"ocr_confidence": null,
|
| 80 |
+
"combined_confidence": 0.96,
|
| 81 |
+
"reason": "Valid ISO 9001:2015 certificate, valid through August 2027.",
|
| 82 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 83 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 84 |
+
"review_status": "pending"
|
| 85 |
+
},
|
| 86 |
+
{
|
| 87 |
+
"verdict_id": "V-c5a9240b",
|
| 88 |
+
"bidder_id": "bidder_b",
|
| 89 |
+
"criterion_id": "C5",
|
| 90 |
+
"verdict": "not_eligible",
|
| 91 |
+
"extracted_value": "No paramilitary experience found",
|
| 92 |
+
"normalized_value": null,
|
| 93 |
+
"source": {
|
| 94 |
+
"doc_name": "project_experience.pdf",
|
| 95 |
+
"page": 1,
|
| 96 |
+
"snippet": "projects are municipal/educational/warehousing \u2014 no paramilitary/defence",
|
| 97 |
+
"source_type": "text_pdf"
|
| 98 |
+
},
|
| 99 |
+
"llm_confidence": 0.85,
|
| 100 |
+
"ocr_confidence": null,
|
| 101 |
+
"combined_confidence": 0.85,
|
| 102 |
+
"reason": "No prior experience with paramilitary or defence infrastructure found in submitted documents.",
|
| 103 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 104 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 105 |
+
"review_status": "pending"
|
| 106 |
+
}
|
| 107 |
+
]
|
data/precomputed/eval_bidder_c.json
ADDED
|
@@ -0,0 +1,107 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
[
|
| 2 |
+
{
|
| 3 |
+
"verdict_id": "V-fbea0467",
|
| 4 |
+
"bidder_id": "bidder_c",
|
| 5 |
+
"criterion_id": "C1",
|
| 6 |
+
"verdict": "needs_review",
|
| 7 |
+
"extracted_value": "INR 5.4 Cr (from scan)",
|
| 8 |
+
"normalized_value": 54000000,
|
| 9 |
+
"source": {
|
| 10 |
+
"doc_name": "turnover_certificate_scan.png",
|
| 11 |
+
"page": 1,
|
| 12 |
+
"snippet": "average annual turnover INR 5,40,00,000",
|
| 13 |
+
"source_type": "tesseract"
|
| 14 |
+
},
|
| 15 |
+
"llm_confidence": 0.55,
|
| 16 |
+
"ocr_confidence": 0.58,
|
| 17 |
+
"combined_confidence": 0.58,
|
| 18 |
+
"reason": "Turnover certificate submitted as a low-quality scan (Tesseract confidence ~58%). Value of INR 5.4 Cr is close to the threshold. Human review required to confirm authenticity and accuracy.",
|
| 19 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 20 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 21 |
+
"review_status": "pending"
|
| 22 |
+
},
|
| 23 |
+
{
|
| 24 |
+
"verdict_id": "V-baa224b9",
|
| 25 |
+
"bidder_id": "bidder_c",
|
| 26 |
+
"criterion_id": "C2",
|
| 27 |
+
"verdict": "eligible",
|
| 28 |
+
"extracted_value": "3 projects completed (borderline)",
|
| 29 |
+
"normalized_value": 3,
|
| 30 |
+
"source": {
|
| 31 |
+
"doc_name": "project_experience.pdf",
|
| 32 |
+
"page": 1,
|
| 33 |
+
"snippet": "3 completed construction projects listed: INR 1.2 Cr, 1.5 Cr, 2.1 Cr",
|
| 34 |
+
"source_type": "text_pdf"
|
| 35 |
+
},
|
| 36 |
+
"llm_confidence": 0.87,
|
| 37 |
+
"ocr_confidence": null,
|
| 38 |
+
"combined_confidence": 0.87,
|
| 39 |
+
"reason": "Bidder has exactly 3 completed similar projects, meeting the minimum threshold.",
|
| 40 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 41 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 42 |
+
"review_status": "pending"
|
| 43 |
+
},
|
| 44 |
+
{
|
| 45 |
+
"verdict_id": "V-f1c8ec3e",
|
| 46 |
+
"bidder_id": "bidder_c",
|
| 47 |
+
"criterion_id": "C3",
|
| 48 |
+
"verdict": "eligible",
|
| 49 |
+
"extracted_value": "GSTIN: 24AABCC9012H1Z1",
|
| 50 |
+
"normalized_value": null,
|
| 51 |
+
"source": {
|
| 52 |
+
"doc_name": "gst_certificate.pdf",
|
| 53 |
+
"page": 1,
|
| 54 |
+
"snippet": "GSTIN: 24AABCC9012H1Z1, Status: ACTIVE",
|
| 55 |
+
"source_type": "text_pdf"
|
| 56 |
+
},
|
| 57 |
+
"llm_confidence": 0.97,
|
| 58 |
+
"ocr_confidence": null,
|
| 59 |
+
"combined_confidence": 0.97,
|
| 60 |
+
"reason": "Valid GST registration certificate with active status.",
|
| 61 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 62 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 63 |
+
"review_status": "pending"
|
| 64 |
+
},
|
| 65 |
+
{
|
| 66 |
+
"verdict_id": "V-6eafeb5a",
|
| 67 |
+
"bidder_id": "bidder_c",
|
| 68 |
+
"criterion_id": "C4",
|
| 69 |
+
"verdict": "eligible",
|
| 70 |
+
"extracted_value": "ISO-2023-9001-SCS, valid 10-09-2027",
|
| 71 |
+
"normalized_value": null,
|
| 72 |
+
"source": {
|
| 73 |
+
"doc_name": "iso_9001.pdf",
|
| 74 |
+
"page": 1,
|
| 75 |
+
"snippet": "ISO 9001:2015 Certificate No: ISO-2023-9001-SCS, Valid Through: 10-09-2027",
|
| 76 |
+
"source_type": "text_pdf"
|
| 77 |
+
},
|
| 78 |
+
"llm_confidence": 0.96,
|
| 79 |
+
"ocr_confidence": null,
|
| 80 |
+
"combined_confidence": 0.96,
|
| 81 |
+
"reason": "Valid ISO 9001:2015 certificate, valid through September 2027.",
|
| 82 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 83 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 84 |
+
"review_status": "pending"
|
| 85 |
+
},
|
| 86 |
+
{
|
| 87 |
+
"verdict_id": "V-5843c2c4",
|
| 88 |
+
"bidder_id": "bidder_c",
|
| 89 |
+
"criterion_id": "C5",
|
| 90 |
+
"verdict": "not_eligible",
|
| 91 |
+
"extracted_value": "No paramilitary experience documented",
|
| 92 |
+
"normalized_value": null,
|
| 93 |
+
"source": {
|
| 94 |
+
"doc_name": "project_experience.pdf",
|
| 95 |
+
"page": 1,
|
| 96 |
+
"snippet": "projects are GIDC/Municipal Corporation/NHAI \u2014 no defence",
|
| 97 |
+
"source_type": "text_pdf"
|
| 98 |
+
},
|
| 99 |
+
"llm_confidence": 0.82,
|
| 100 |
+
"ocr_confidence": null,
|
| 101 |
+
"combined_confidence": 0.82,
|
| 102 |
+
"reason": "No prior experience with paramilitary or defence infrastructure found.",
|
| 103 |
+
"model_version": "deepseek-chat@2026-05-07",
|
| 104 |
+
"timestamp": "2026-05-06T19:17:36.919283+00:00",
|
| 105 |
+
"review_status": "pending"
|
| 106 |
+
}
|
| 107 |
+
]
|
scripts/precompute_results.py
CHANGED
|
@@ -1 +1,72 @@
|
|
| 1 |
"""Step 11 — runs the full pipeline and writes data/precomputed/*.json."""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
"""Step 11 — runs the full pipeline and writes data/precomputed/*.json."""
|
| 2 |
+
|
| 3 |
+
import json
|
| 4 |
+
import sys
|
| 5 |
+
from pathlib import Path
|
| 6 |
+
|
| 7 |
+
BASE_DIR = Path(__file__).resolve().parent.parent
|
| 8 |
+
sys.path.insert(0, str(BASE_DIR))
|
| 9 |
+
|
| 10 |
+
from core.config import DATA_DIR, DEEPSEEK_API_KEY, PRECOMPUTED_DIR
|
| 11 |
+
from core.criteria_extractor import extract_criteria
|
| 12 |
+
from core.bidder_processor import process_bidder
|
| 13 |
+
from core.evaluator import evaluate_bidder
|
| 14 |
+
from core.fallback import _HARDCODED_CRITERIA
|
| 15 |
+
from core.schemas import Criterion
|
| 16 |
+
|
| 17 |
+
|
| 18 |
+
def main() -> None:
|
| 19 |
+
if not DEEPSEEK_API_KEY:
|
| 20 |
+
print("ERROR: DEEPSEEK_API_KEY is not set.")
|
| 21 |
+
print("Set it in .env or export it before running this script.")
|
| 22 |
+
sys.exit(1)
|
| 23 |
+
|
| 24 |
+
PRECOMPUTED_DIR.mkdir(parents=True, exist_ok=True)
|
| 25 |
+
|
| 26 |
+
# Step 1 — Extract criteria
|
| 27 |
+
tender_path = DATA_DIR / "tender" / "crpf_construction_tender.pdf"
|
| 28 |
+
print(f"Extracting criteria from {tender_path.name}...")
|
| 29 |
+
try:
|
| 30 |
+
criteria = extract_criteria(tender_path)
|
| 31 |
+
print(f" Got {len(criteria)} criteria from LLM.")
|
| 32 |
+
except Exception as e:
|
| 33 |
+
print(f" LLM extraction failed ({e}), using hardcoded criteria.")
|
| 34 |
+
criteria = [Criterion(**c) for c in _HARDCODED_CRITERIA]
|
| 35 |
+
|
| 36 |
+
criteria_file = PRECOMPUTED_DIR / "criteria.json"
|
| 37 |
+
criteria_file.write_text(
|
| 38 |
+
json.dumps({"criteria": [c.model_dump() for c in criteria]},
|
| 39 |
+
indent=2, ensure_ascii=False),
|
| 40 |
+
encoding="utf-8",
|
| 41 |
+
)
|
| 42 |
+
print(f" Saved {criteria_file}")
|
| 43 |
+
|
| 44 |
+
# Step 2 — Process + evaluate each bidder
|
| 45 |
+
bidders = ["bidder_a", "bidder_b", "bidder_c"]
|
| 46 |
+
for bidder_id in bidders:
|
| 47 |
+
bidder_dir = DATA_DIR / "bidders" / bidder_id
|
| 48 |
+
files = sorted(bidder_dir.glob("*"))
|
| 49 |
+
files = [f for f in files if f.suffix.lower() in {".pdf", ".png", ".jpg"}]
|
| 50 |
+
|
| 51 |
+
print(f"\nProcessing {bidder_id} ({len(files)} files)...")
|
| 52 |
+
process_bidder(bidder_id, files)
|
| 53 |
+
|
| 54 |
+
print(f" Evaluating {bidder_id} against {len(criteria)} criteria...")
|
| 55 |
+
verdicts = evaluate_bidder(bidder_id, criteria)
|
| 56 |
+
|
| 57 |
+
eval_file = PRECOMPUTED_DIR / f"eval_{bidder_id}.json"
|
| 58 |
+
eval_file.write_text(
|
| 59 |
+
json.dumps([v.model_dump() for v in verdicts], indent=2, ensure_ascii=False),
|
| 60 |
+
encoding="utf-8",
|
| 61 |
+
)
|
| 62 |
+
print(f" Saved {eval_file}")
|
| 63 |
+
for v in verdicts:
|
| 64 |
+
print(f" {v.criterion_id}: {v.verdict} (conf={v.combined_confidence:.2f})")
|
| 65 |
+
|
| 66 |
+
print("\nPre-computation complete. Files in data/precomputed/:")
|
| 67 |
+
for f in sorted(PRECOMPUTED_DIR.glob("*.json")):
|
| 68 |
+
print(f" {f.name} ({f.stat().st_size} bytes)")
|
| 69 |
+
|
| 70 |
+
|
| 71 |
+
if __name__ == "__main__":
|
| 72 |
+
main()
|
specs/12_precompute.md
ADDED
|
@@ -0,0 +1,73 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Spec 12 — Pre-compute Results
|
| 2 |
+
|
| 3 |
+
**Step:** 11 of 15
|
| 4 |
+
**Time budget:** ~15 min
|
| 5 |
+
**Checkpoint:** Four JSON files exist in `data/precomputed/` and validate against the schemas.
|
| 6 |
+
|
| 7 |
+
---
|
| 8 |
+
|
| 9 |
+
## Goal
|
| 10 |
+
|
| 11 |
+
`scripts/precompute_results.py` runs the full pipeline once (requires a valid API key), saves the results as JSON fallback files, and commits them to the repo. When the API is unavailable during a demo, `fallback.py` reads these files instead.
|
| 12 |
+
|
| 13 |
+
---
|
| 14 |
+
|
| 15 |
+
## Script: `scripts/precompute_results.py`
|
| 16 |
+
|
| 17 |
+
```python
|
| 18 |
+
"""Step 11 — runs the full pipeline and writes data/precomputed/*.json."""
|
| 19 |
+
```
|
| 20 |
+
|
| 21 |
+
### Steps
|
| 22 |
+
|
| 23 |
+
1. Ensure `data/precomputed/` exists.
|
| 24 |
+
2. Extract criteria from mock tender → save `data/precomputed/criteria.json`:
|
| 25 |
+
```json
|
| 26 |
+
{"criteria": [<Criterion.model_dump()>, ...]}
|
| 27 |
+
```
|
| 28 |
+
3. For each bidder (`bidder_a`, `bidder_b`, `bidder_c`):
|
| 29 |
+
a. Process all bidder docs (`process_bidder`).
|
| 30 |
+
b. Evaluate all criteria (`evaluate_bidder`).
|
| 31 |
+
c. Save `data/precomputed/eval_{bidder_id}.json`:
|
| 32 |
+
```json
|
| 33 |
+
[<Verdict.model_dump()>, ...]
|
| 34 |
+
```
|
| 35 |
+
4. Print summary and exit 0.
|
| 36 |
+
|
| 37 |
+
### Error handling
|
| 38 |
+
|
| 39 |
+
If the LLM fails for any criterion: catch `LLMUnavailable`, log a warning, skip that criterion (don't crash). At least the criteria file and partial evals are better than nothing.
|
| 40 |
+
|
| 41 |
+
If no API key: print instructions and exit 1.
|
| 42 |
+
|
| 43 |
+
---
|
| 44 |
+
|
| 45 |
+
## Fallback file format
|
| 46 |
+
|
| 47 |
+
### `criteria.json`
|
| 48 |
+
```json
|
| 49 |
+
{
|
| 50 |
+
"criteria": [
|
| 51 |
+
{"id": "C1", "title": "...", ...},
|
| 52 |
+
...
|
| 53 |
+
]
|
| 54 |
+
}
|
| 55 |
+
```
|
| 56 |
+
|
| 57 |
+
### `eval_bidder_a.json`
|
| 58 |
+
```json
|
| 59 |
+
[
|
| 60 |
+
{"verdict_id": "V-abc123", "bidder_id": "bidder_a", "criterion_id": "C1", "verdict": "eligible", ...},
|
| 61 |
+
...
|
| 62 |
+
]
|
| 63 |
+
```
|
| 64 |
+
|
| 65 |
+
---
|
| 66 |
+
|
| 67 |
+
## Acceptance Criteria
|
| 68 |
+
|
| 69 |
+
1. Running `python scripts/precompute_results.py` exits 0 when API key is set.
|
| 70 |
+
2. `data/precomputed/criteria.json` exists and contains `{"criteria": [...]}` with 5 items.
|
| 71 |
+
3. Each `eval_bidder_*.json` contains a list of 5 `Verdict` dicts.
|
| 72 |
+
4. `from core.fallback import load_criteria` returns 5 `Criterion` objects from the file.
|
| 73 |
+
5. `from core.fallback import load_evaluation` returns the correct `Verdict` for bidder_a, C1.
|