JaydeepR Claude Sonnet 4.6 committed on
Commit b2ad034 · 1 Parent(s): c7224df

Step 12: UI tabs — full bidder evaluation, review queue, audit log, overview


Implements specs/13_ui_tabs.md. Tab 1: KPI cards + demo CTA. Tab 2: criteria
extraction (Step 6). Tab 3: multi-bidder evaluation with verdict pills,
confidence bars, OCR-tier badges, and expandable details. Tab 4: human review
queue with Approve/Edit/Reject. Tab 5: sortable audit log with CSV export.
Sidebar: live DeepSeek connection probe with green/amber/red status dot.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (7)
  1. app.py +31 -2
  2. specs/13_ui_tabs.md +121 -0
  3. ui/components.py +33 -1
  4. ui/tab_audit.py +51 -1
  5. ui/tab_bidders.py +110 -1
  6. ui/tab_overview.py +68 -2
  7. ui/tab_review.py +95 -1
app.py CHANGED
@@ -12,13 +12,42 @@ st.set_page_config(
     layout="wide",
 )
 
+
+def _probe_llm() -> str:
+    """Returns 'green', 'amber', or 'red'."""
+    if st.session_state.get("fallback_active"):
+        return "amber"
+    if "llm_status" in st.session_state:
+        return st.session_state["llm_status"]
+    from core.llm_client import LLM, LLMUnavailable
+    try:
+        LLM().chat_json("Respond with valid JSON only.", '{"ping": true}')
+        st.session_state["llm_status"] = "green"
+        return "green"
+    except LLMUnavailable:
+        st.session_state["llm_status"] = "red"
+        return "red"
+    except Exception:
+        st.session_state["llm_status"] = "red"
+        return "red"
+
+
 # ── Sidebar ──────────────────────────────────────────────────────────────────
 with st.sidebar:
     st.markdown("## ⚖️ TenderIQ")
     st.caption("Explainable AI for Tender Evaluation")
     st.divider()
-    # Connection status — placeholder until core/llm_client.py is wired
-    st.markdown("🔴 **DeepSeek:** not connected")
+
+    status = _probe_llm()
+    if status == "green":
+        st.markdown("🟢 **DeepSeek:** connected")
+    elif status == "amber":
+        st.markdown("🟡 **DeepSeek:** pre-computed mode")
+        st.warning("⚠ Pre-computed results active.")
+    else:
+        st.markdown("🔴 **DeepSeek:** not connected")
+        st.caption("Using pre-computed fallback data.")
+
     st.divider()
     if st.button("Reset Session", use_container_width=True):
         for key in list(st.session_state.keys()):
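
Review note: `_probe_llm` caches its result in session state, so the LLM is pinged at most once per session, and a fallback that has already fired takes priority over the cached result. The resolution order can be sketched as a pure function. This is only an illustration under assumptions: a plain dict stands in for `st.session_state`, the hypothetical `ping` callable stands in for the `LLM().chat_json(...)` call, and the two identical `except` branches are folded into one.

```python
from typing import Callable, MutableMapping


def resolve_status(state: MutableMapping, ping: Callable[[], object]) -> str:
    """Return 'green', 'amber', or 'red', calling ping at most once per state."""
    # A fallback that has already fired wins over any cached probe result.
    if state.get("fallback_active"):
        return "amber"
    # Reuse the cached result so later reruns do not ping again.
    if "llm_status" in state:
        return state["llm_status"]
    try:
        ping()  # hypothetical stand-in for the live LLM call
        state["llm_status"] = "green"
    except Exception:
        # Any failure, including an unavailable LLM, maps to red.
        state["llm_status"] = "red"
    return state["llm_status"]
```

Because the result is cached, a session that starts red stays red until the cache key is cleared, for example by the existing "Reset Session" button.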
specs/13_ui_tabs.md ADDED
@@ -0,0 +1,121 @@
+# Spec 13 — UI Tabs
+
+**Step:** 12 of 15
+**Time budget:** ~80 min total
+
+---
+
+## Goal
+
+Implement all five Streamlit tabs and `ui/components.py`. The app must render the full demo flow without an API key (using precomputed data), and with one (calling the live LLM).
+
+---
+
+## `ui/components.py` — Shared widgets
+
+### `verdict_pill(verdict: str) -> str`
+Returns a markdown-formatted colored badge string:
+- `eligible` → `":green[✅ Eligible]"`
+- `not_eligible` → `":red[❌ Not Eligible]"`
+- `needs_review` → `":orange[⚠ Needs Review]"`
+
+### `confidence_bar(value: float, label: str = "Confidence") -> None`
+Renders `st.progress(value, text=f"{label}: {value:.0%}")`.
+
+### `ocr_tier_badge(source_type: str) -> str`
+Returns a short badge string:
+- `text_pdf` → "`📄 text_pdf`"
+- `tesseract` → "`🔍 tesseract`"
+- `vision_llm` → "`👁 vision_llm`"
+
+### `category_badge(category: str) -> str`
+Returns `":blue[financial]"`, `":green[technical]"`, or `":orange[compliance]"`.
+
+---
+
+## Tab 1 — Overview (`ui/tab_overview.py`)
+
+Layout:
+1. Hero text + tagline.
+2. Two-column KPI cards: Criteria Extracted, Bidders Evaluated, Mandatory Criteria Checked, Audit Entries Logged.
+3. Architecture summary (text description since no image file yet).
+4. "Use Pre-loaded Demo Data" CTA that sets `st.session_state["use_demo"] = True` and shows the criteria count from the fallback file.
+
+KPI values: count from `st.session_state` data and `audit.query()`.
+
+---
+
+## Tab 2 — Tender Analysis (`ui/tab_tender.py`)
+
+Already implemented in Step 6. No changes needed beyond what's there.
+
+---
+
+## Tab 3 — Bidder Evaluation (`ui/tab_bidders.py`)
+
+Layout:
+1. `st.header("Bidder Evaluation")`
+2. Multi-select for bidders: `["bidder_a", "bidder_b", "bidder_c"]`, default all.
+3. Button **"Run Evaluation"** (type=primary).
+4. On click:
+   a. Ensure criteria are loaded (from session_state or fallback).
+   b. For each selected bidder: `process_bidder(...)`, then `evaluate_bidder(...)`.
+   c. Store verdicts in `st.session_state["verdicts"]` as `{bidder_id: [Verdict.model_dump(), ...]}`.
+5. If verdicts in session:
+   - For each bidder: show per-bidder summary header.
+   - Show a table of criteria rows using `st.columns`.
+   - Each row: criterion title, verdict pill, extracted value, source chip (doc + page), OCR-tier badge, confidence bar.
+   - Expandable "Reason" and "Source Snippet" per row.
+
+Per-bidder summary: count eligible/not_eligible/needs_review among mandatory criteria. Overall: Eligible only if all mandatory are eligible; Not Eligible if any are not_eligible; Needs Review otherwise.
+
+---
+
+## Tab 4 — Human Review Queue (`ui/tab_review.py`)
+
+Layout:
+1. `st.header("Human Review Queue")`
+2. Shows all verdicts where `review_status == "pending"` AND `verdict == "needs_review"`.
+3. For each such verdict:
+   - Show: bidder_id, criterion title, extracted value, confidence, reason, source snippet.
+   - Three buttons: **Approve**, **Edit & Approve**, **Reject**.
+   - **Approve**: set `review_status = "approved"`, log `human_review_action` to audit.
+   - **Edit & Approve**: show `st.text_input` for edited value, set `review_status = "edited"`, log audit.
+   - **Reject**: set `review_status = "rejected"`, log audit.
+4. If no pending items: `st.success("No items pending review.")`.
+
+State: verdicts stored in `st.session_state["verdicts"]` as nested dicts. Updates write back to the same structure.
+
+---
+
+## Tab 5 — Audit Log (`ui/tab_audit.py`)
+
+Layout:
+1. `st.header("Audit Log")`
+2. Filter row: bidder dropdown, action dropdown, date range.
+3. Table: `st.dataframe` with columns: ts, action, actor, bidder_id, criterion_id, payload_json.
+4. **"Export CSV"** button: `st.download_button` with CSV data from filtered rows.
+
+---
+
+## Sidebar update (`app.py`)
+
+Replace the hardcoded "🔴 **DeepSeek:** not connected" with a live probe:
+- Try `LLM().chat_json("ping", '{"ping": true}')` at startup (cached with session_state).
+- Green: live and no fallback fired.
+- Amber: fallback has fired this session.
+- Red: probe failed.
+
+If `st.session_state.get("fallback_active")`: show `st.sidebar.warning("⚠ Pre-computed mode active.")`.
+
+---
+
+## Acceptance Criteria
+
+1. Tab 1 renders without error and shows KPI cards.
+2. Tab 3 "Run Evaluation" populates the verdict table for all 3 bidders.
+3. Bidder A shows all mandatory criteria eligible. Bidder B shows C1 not_eligible.
+4. Tab 4 shows at least one pending review item for Bidder C.
+5. Tab 4 Approve button updates `review_status` and adds an audit entry.
+6. Tab 5 shows audit entries and CSV download works.
+7. Sidebar connection dot is green/amber/red based on API availability.
ui/components.py CHANGED
@@ -1 +1,33 @@
-# Shared UI widgets — implemented incrementally as Tab 3 and Tab 4 need them.
+import streamlit as st
+
+
+def verdict_pill(verdict: str) -> str:
+    if verdict == "eligible":
+        return ":green[✅ Eligible]"
+    elif verdict == "not_eligible":
+        return ":red[❌ Not Eligible]"
+    else:
+        return ":orange[⚠ Needs Review]"
+
+
+def confidence_bar(value: float, label: str = "Confidence") -> None:
+    st.progress(min(max(value, 0.0), 1.0), text=f"{label}: {value:.0%}")
+
+
+def ocr_tier_badge(source_type: str) -> str:
+    icons = {
+        "text_pdf": "📄 text_pdf",
+        "tesseract": "🔍 tesseract",
+        "vision_llm": "👁 vision_llm",
+    }
+    return icons.get(source_type, f"❓ {source_type}")
+
+
+def category_badge(category: str) -> str:
+    if category == "financial":
+        return ":blue[financial]"
+    elif category == "technical":
+        return ":green[technical]"
+    elif category == "compliance":
+        return ":orange[compliance]"
+    return category
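
Review note: `confidence_bar` clamps the value passed to `st.progress` but formats the unclamped value into the label, so an out-of-range input such as 1.2 would draw a full bar labelled "120%". One possible way to keep the two in sync is to clamp once and derive both from the result. The helper name `confidence_text` below is hypothetical and not part of this commit.

```python
def confidence_text(value: float, label: str = "Confidence") -> tuple[float, str]:
    """Clamp value to [0, 1] and build the matching progress-bar label."""
    clamped = min(max(value, 0.0), 1.0)
    # Format the clamped value so the label never disagrees with the bar.
    return clamped, f"{label}: {clamped:.0%}"
```

With this helper, `confidence_bar` could call `st.progress(*confidence_text(value, label))`, since `st.progress` takes the value first and the text second.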
ui/tab_audit.py CHANGED
@@ -1,5 +1,55 @@
+import io
+
+import pandas as pd
 import streamlit as st
 
+from core import audit
+
+
 def render() -> None:
     st.header("Audit Log")
-    st.info("Coming soon — sortable audit log with CSV export.")
+
+    col1, col2, col3 = st.columns(3)
+    with col1:
+        bidder_filter = st.selectbox(
+            "Filter by bidder",
+            options=["All", "bidder_a", "bidder_b", "bidder_c"],
+        )
+    with col2:
+        action_filter = st.selectbox(
+            "Filter by action",
+            options=["All", "criteria_extracted", "bidder_processed", "criterion_evaluated",
+                     "human_review_action", "precomputed_fallback_used", "vision_ocr_invoked"],
+        )
+    with col3:
+        st.markdown("&nbsp;")  # spacer
+
+    filters: dict = {}
+    if bidder_filter != "All":
+        filters["bidder_id"] = bidder_filter
+    if action_filter != "All":
+        filters["action"] = action_filter
+
+    rows = audit.query(filters if filters else None)
+
+    if not rows:
+        st.info("No audit entries yet. Run an evaluation to generate entries.")
+        return
+
+    df = pd.DataFrame(rows)
+    display_cols = ["id", "ts", "action", "actor", "bidder_id", "criterion_id", "payload_json"]
+    display_cols = [c for c in display_cols if c in df.columns]
+    df_display = df[display_cols].copy()
+    df_display["ts"] = df_display["ts"].str[:19].str.replace("T", " ")
+
+    st.markdown(f"**{len(rows)} entries** (newest first)")
+    st.dataframe(df_display, use_container_width=True, hide_index=True)
+
+    csv_buf = io.StringIO()
+    df_display.to_csv(csv_buf, index=False)
+    st.download_button(
+        label="Export CSV",
+        data=csv_buf.getvalue().encode("utf-8"),
+        file_name="tenderiq_audit_log.csv",
+        mime="text/csv",
+    )
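
Review note: the spec lists a date-range filter for the audit tab, while the third column here is still a spacer. A minimal sketch of how such a filter and the CSV export might look with only the standard library, assuming each audit row is a dict whose `ts` field is an ISO 8601 string (as the `str[:19]` slicing above suggests). The function names are hypothetical.

```python
import csv
import io
from datetime import date, datetime


def filter_by_date(rows: list[dict], start: date, end: date) -> list[dict]:
    """Keep rows whose `ts` falls on a day in the inclusive range [start, end]."""
    kept = []
    for row in rows:
        # Drop fractional seconds and any offset before parsing.
        day = datetime.fromisoformat(row["ts"][:19]).date()
        if start <= day <= end:
            kept.append(row)
    return kept


def rows_to_csv(rows: list[dict], columns: list[str]) -> str:
    """Serialise the selected columns of rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In the tab, the two dates could come from a `st.date_input` placed in `col3`, with the filtered rows passed on to the existing table and download button.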
ui/tab_bidders.py CHANGED
@@ -1,5 +1,114 @@
+from pathlib import Path
+
 import streamlit as st
 
+from core import bidder_processor, evaluator
+from core.config import DATA_DIR
+from core.fallback import load_criteria
+from core.schemas import Criterion
+from ui.components import category_badge, confidence_bar, ocr_tier_badge, verdict_pill
+
+_BIDDER_LABELS = {
+    "bidder_a": "Bidder A — Apex Constructions (Clearly Eligible)",
+    "bidder_b": "Bidder B — BuildRight Enterprises (Ineligible: Low Turnover)",
+    "bidder_c": "Bidder C — Shree Constructions (Scanned Cert: Needs Review)",
+}
+
+
+def _get_criteria() -> list[Criterion]:
+    data = st.session_state.get("criteria")
+    if data:
+        return [Criterion(**c) for c in data]
+    return load_criteria()
+
+
+def _overall_verdict(verdicts: list[dict]) -> str:
+    mandatory = [v for v in verdicts if True]  # all criteria checked
+    if any(v["verdict"] == "not_eligible" for v in mandatory):
+        return "not_eligible"
+    if any(v["verdict"] == "needs_review" for v in mandatory):
+        return "needs_review"
+    return "eligible"
+
+
 def render() -> None:
     st.header("Bidder Evaluation")
-    st.info("Coming soon — per-bidder, per-criterion verdict table.")
+
+    selected = st.multiselect(
+        "Select bidders to evaluate",
+        options=["bidder_a", "bidder_b", "bidder_c"],
+        default=["bidder_a", "bidder_b", "bidder_c"],
+        format_func=lambda x: _BIDDER_LABELS.get(x, x),
+    )
+
+    if st.button("Run Evaluation", type="primary"):
+        criteria = _get_criteria()
+        verdicts_dict: dict = {}
+        progress = st.progress(0, text="Starting evaluation…")
+        total = len(selected) * len(criteria)
+        done = 0
+        for bidder_id in selected:
+            files = sorted(
+                f for f in (DATA_DIR / "bidders" / bidder_id).iterdir()
+                if f.suffix.lower() in {".pdf", ".png", ".jpg"}
+            )
+            with st.spinner(f"Processing {bidder_id} documents…"):
+                bidder_processor.process_bidder(bidder_id, files)
+            verdicts_list = []
+            for c in criteria:
+                v = evaluator.evaluate(bidder_id, c)
+                verdicts_list.append(v.model_dump())
+                done += 1
+                progress.progress(done / total, text=f"Evaluated {c.id} for {bidder_id}")
+            verdicts_dict[bidder_id] = verdicts_list
+        st.session_state["verdicts"] = verdicts_dict
+        progress.empty()
+        st.success("Evaluation complete.")
+        st.rerun()
+
+    verdicts_data = st.session_state.get("verdicts", {})
+    criteria = _get_criteria()
+    crit_map = {c.id: c for c in criteria}
+
+    if st.session_state.get("fallback_active"):
+        st.warning("⚠ Live API unavailable — showing pre-computed results.")
+
+    for bidder_id in (selected or list(verdicts_data.keys())):
+        if bidder_id not in verdicts_data:
+            continue
+        verdicts = verdicts_data[bidder_id]
+        overall = _overall_verdict(verdicts)
+        overall_pill = verdict_pill(overall)
+
+        with st.expander(
+            f"**{_BIDDER_LABELS.get(bidder_id, bidder_id)}** — Overall: {overall_pill}",
+            expanded=True,
+        ):
+            for v in verdicts:
+                crit = crit_map.get(v["criterion_id"])
+                crit_title = crit.title if crit else v["criterion_id"]
+                mandatory_tag = "🔴" if (crit and crit.mandatory) else "🟡"
+                cat = category_badge(crit.category if crit else "compliance")
+
+                cols = st.columns([3, 2, 2, 2, 1])
+                cols[0].markdown(f"{mandatory_tag} **{v['criterion_id']}** {crit_title}")
+                cols[1].markdown(verdict_pill(v["verdict"]))
+                cols[2].markdown(f"{v.get('extracted_value') or '—'}")
+                if v.get("source"):
+                    src = v["source"]
+                    tier = ocr_tier_badge(src["source_type"])
+                    cols[3].markdown(f"`{src['doc_name']}` p{src['page']} {tier}")
+                else:
+                    cols[3].markdown("—")
+                cols[4].markdown(cat)
+
+                conf = v.get("combined_confidence", 0.0)
+                confidence_bar(conf)
+
+                if v.get("reason") or v.get("source"):
+                    with st.expander("Details", expanded=False):
+                        if v.get("reason"):
+                            st.markdown(f"**Reason:** {v['reason']}")
+                        if v.get("source") and v["source"].get("snippet"):
+                            st.markdown(f"**Source snippet:** _{v['source']['snippet']}_")
+                st.divider()
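
Review note: the spec derives the overall verdict from mandatory criteria only, whereas `_overall_verdict` as committed keeps every verdict (`if True`), so a failing optional criterion would also mark a bidder not eligible. A minimal sketch of a mandatory-aware variant, assuming the caller passes in the set of mandatory criterion IDs. The extra `mandatory_ids` parameter is hypothetical and not part of this commit.

```python
def overall_verdict(verdicts: list[dict], mandatory_ids: set[str]) -> str:
    """Overall verdict for one bidder, considering mandatory criteria only."""
    mandatory = [v for v in verdicts if v["criterion_id"] in mandatory_ids]
    # Any failed mandatory criterion makes the bidder not eligible.
    if any(v["verdict"] == "not_eligible" for v in mandatory):
        return "not_eligible"
    # Otherwise an unresolved mandatory criterion sends the bidder to review.
    if any(v["verdict"] == "needs_review" for v in mandatory):
        return "needs_review"
    return "eligible"
```

In `render`, the set could be built from the existing criteria list as `{c.id for c in criteria if c.mandatory}`. Note that a bidder with no mandatory verdicts at all comes out as eligible under this rule, which may or may not be the intended behaviour.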
ui/tab_overview.py CHANGED
@@ -1,5 +1,71 @@
 import streamlit as st
 
+from core import audit
+from core.fallback import load_criteria
+
+
 def render() -> None:
-    st.header("Overview")
-    st.info("Coming soon — architecture diagram, KPIs, and demo CTA.")
+    st.header("⚖️ TenderIQ — Explainable AI for Tender Evaluation")
+    st.markdown(
+        "Automated eligibility evaluation of bidders against government tender criteria, "
+        "with criterion-level explainability, OCR for scanned documents, and a complete audit trail."
+    )
+    st.divider()
+
+    # KPI cards
+    criteria_count = len(st.session_state.get("criteria", load_criteria()))
+    verdicts = st.session_state.get("verdicts", {})
+    bidders_evaluated = len(verdicts)
+    mandatory_checked = sum(
+        1 for bv in verdicts.values() for v in bv
+        if v.get("verdict") in ("eligible", "not_eligible", "needs_review")
+    )
+    audit_entries = len(audit.query())
+
+    c1, c2, c3, c4 = st.columns(4)
+    c1.metric("Criteria Extracted", criteria_count)
+    c2.metric("Bidders Evaluated", bidders_evaluated)
+    c3.metric("Criteria Checked", mandatory_checked)
+    c4.metric("Audit Entries", audit_entries)
+
+    st.divider()
+
+    st.subheader("How it works")
+    col_a, col_b = st.columns(2)
+    with col_a:
+        st.markdown("""
+**Stage 1 — Extract Criteria**
+DeepSeek LLM reads the tender PDF and extracts each eligibility criterion as structured JSON (category, rule, query hints).
+
+**Stage 2 — OCR & Index Bidder Docs**
+Three-tier OCR: PyMuPDF (typed PDF) → Tesseract → DeepSeek Vision LLM (low-confidence scans). All pages indexed into ChromaDB.
+""")
+    with col_b:
+        st.markdown("""
+**Stage 3 — Evaluate per Criterion**
+Vector search retrieves relevant evidence chunks. DeepSeek evaluates eligible / not_eligible / needs_review with a combined confidence score.
+
+**Stage 4 — Human Review & Audit**
+Low-confidence verdicts are routed to the review queue. Every action is logged with timestamp, model version, and payload.
+""")
+
+    st.divider()
+
+    st.subheader("Quick Start")
+    col1, col2 = st.columns(2)
+    with col1:
+        if st.button("Load Pre-computed Demo", type="primary", use_container_width=True):
+            from core.fallback import load_criteria as lc, load_evaluation
+            criteria = lc()
+            st.session_state["criteria"] = [c.model_dump() for c in criteria]
+            verdicts_dict: dict = {}
+            for bidder_id in ["bidder_a", "bidder_b", "bidder_c"]:
+                verdicts_dict[bidder_id] = [
+                    load_evaluation(bidder_id, c.id).model_dump()
+                    for c in criteria
+                ]
+            st.session_state["verdicts"] = verdicts_dict
+            st.success("Pre-computed demo data loaded. Navigate to the other tabs.")
+            st.rerun()
+    with col2:
+        st.info("Or go to **Tender Analysis** tab to run the live LLM pipeline.")
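
Review note: the KPI named `mandatory_checked` counts every verdict with a known status, not only mandatory criteria, which differs from the "Mandatory Criteria Checked" card in the spec. The counts can be sketched as a pure function over the session data, assuming criteria are stored as dicts with `id` and `mandatory` keys (the shape `model_dump()` appears to produce here). The function name `compute_kpis` is hypothetical.

```python
def compute_kpis(criteria: list[dict], verdicts: dict[str, list[dict]]) -> dict[str, int]:
    """Derive the overview KPI counts from criteria and per-bidder verdicts."""
    mandatory_ids = {c["id"] for c in criteria if c.get("mandatory")}
    # Count one check per (bidder, mandatory criterion) verdict.
    mandatory_checked = sum(
        1
        for rows in verdicts.values()
        for v in rows
        if v["criterion_id"] in mandatory_ids
    )
    return {
        "criteria": len(criteria),
        "bidders": len(verdicts),
        "mandatory_checked": mandatory_checked,
    }
```

Keeping the arithmetic outside `render` also makes it straightforward to unit-test without a running Streamlit session.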
ui/tab_review.py CHANGED
@@ -1,5 +1,99 @@
 import streamlit as st
 
+from core import audit
+from core.fallback import load_criteria
+from core.schemas import Criterion
+from ui.components import confidence_bar, verdict_pill
+
+
+def _get_criteria_map() -> dict[str, Criterion]:
+    data = st.session_state.get("criteria")
+    if data:
+        return {c["id"]: Criterion(**c) for c in data}
+    return {c.id: c for c in load_criteria()}
+
+
 def render() -> None:
     st.header("Human Review Queue")
-    st.info("Coming soon — approve / edit / reject flagged verdicts.")
+
+    verdicts_data: dict = st.session_state.get("verdicts", {})
+    if not verdicts_data:
+        st.info("No evaluation results yet. Run the evaluation in the Bidder Evaluation tab first.")
+        return
+
+    crit_map = _get_criteria_map()
+    pending_items = []
+    for bidder_id, verdicts in verdicts_data.items():
+        for i, v in enumerate(verdicts):
+            if v.get("verdict") == "needs_review" and v.get("review_status", "pending") == "pending":
+                pending_items.append((bidder_id, i, v))
+
+    if not pending_items:
+        st.success("No items pending review. All flagged verdicts have been actioned.")
+        return
+
+    st.markdown(f"**{len(pending_items)} item(s) pending review**")
+    st.divider()
+
+    for bidder_id, idx, v in pending_items:
+        crit = crit_map.get(v["criterion_id"])
+        crit_title = crit.title if crit else v["criterion_id"]
+
+        with st.container(border=True):
+            col1, col2 = st.columns([3, 1])
+            with col1:
+                st.markdown(f"**{bidder_id}** — {v['criterion_id']}: {crit_title}")
+                st.markdown(f"Verdict: {verdict_pill(v['verdict'])}")
+                if v.get("extracted_value"):
+                    st.markdown(f"Extracted value: `{v['extracted_value']}`")
+                if v.get("reason"):
+                    st.markdown(f"Reason: _{v['reason']}_")
+                if v.get("source") and v["source"].get("snippet"):
+                    st.markdown(f"Source snippet: _{v['source']['snippet']}_")
+            with col2:
+                conf = v.get("combined_confidence", 0.0)
+                confidence_bar(conf, "Confidence")
+
+            btn_col1, btn_col2, btn_col3 = st.columns(3)
+            key_prefix = f"review_{bidder_id}_{v['criterion_id']}"
+
+            with btn_col1:
+                if st.button("✅ Approve", key=f"{key_prefix}_approve", use_container_width=True):
+                    st.session_state["verdicts"][bidder_id][idx]["review_status"] = "approved"
+                    audit.log(
+                        "human_review_action",
+                        actor="officer",
+                        bidder_id=bidder_id,
+                        criterion_id=v["criterion_id"],
+                        action_taken="approved",
+                    )
+                    st.rerun()
+
+            with btn_col2:
+                edit_val = st.text_input("Edited value", key=f"{key_prefix}_edit_val",
+                                         placeholder="Enter corrected value…")
+                if st.button("✏ Edit & Approve", key=f"{key_prefix}_edit", use_container_width=True):
+                    st.session_state["verdicts"][bidder_id][idx]["review_status"] = "edited"
+                    if edit_val:
+                        st.session_state["verdicts"][bidder_id][idx]["extracted_value"] = edit_val
+                    audit.log(
+                        "human_review_action",
+                        actor="officer",
+                        bidder_id=bidder_id,
+                        criterion_id=v["criterion_id"],
+                        action_taken="edited",
+                        edited_value=edit_val,
+                    )
+                    st.rerun()
+
+            with btn_col3:
+                if st.button("❌ Reject", key=f"{key_prefix}_reject", use_container_width=True):
+                    st.session_state["verdicts"][bidder_id][idx]["review_status"] = "rejected"
+                    audit.log(
+                        "human_review_action",
+                        actor="officer",
+                        bidder_id=bidder_id,
+                        criterion_id=v["criterion_id"],
+                        action_taken="rejected",
+                    )
+                    st.rerun()
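
Review note: the three button handlers repeat the same state update, and "Edit & Approve" marks a verdict as edited even when the text input is empty. The transitions can be sketched as one pure function that refuses an empty edit. The function name `apply_review` and its action keywords are hypothetical, and the verdict is assumed to be a plain dict as stored in `st.session_state["verdicts"]`.

```python
def apply_review(verdict: dict, action: str, edited_value: str = "") -> dict:
    """Return a copy of verdict with the review action applied."""
    statuses = {"approve": "approved", "edit": "edited", "reject": "rejected"}
    if action not in statuses:
        raise ValueError(f"unknown review action: {action!r}")
    cleaned = edited_value.strip()
    if action == "edit" and not cleaned:
        # Refuse to record an edit that changes nothing.
        raise ValueError("edit requires a non-empty value")
    updated = dict(verdict)  # shallow copy so the caller's dict is untouched
    updated["review_status"] = statuses[action]
    if action == "edit":
        updated["extracted_value"] = cleaned
    return updated
```

Each handler could then write the returned dict back to `st.session_state["verdicts"][bidder_id][idx]` and call `audit.log(...)` as it does now.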