Spaces:

lablab-ai-amd-developer-hackathon
/

paperhawk

Running

Nándorfi Vince

Initial paperhawk push to HF Space (LFS for binaries)

7ff7119 4 days ago

6.66 kB

	"""LLM risk-analysis prompts and JSON schema (English).

	``RISK_SYSTEM_PROMPT`` is the full anti-hallucination prompt for the LLM:
	- 9 NORMAL examples (semantic guardrails)
	- 6 RISK examples (model calibration)
	- 4 CRITICAL RULES (empty list OK, concrete data refs, no fabrication, English/concise)

	The structured-output schema is mirrored by the ``LLMRiskResult`` Pydantic
	model in ``llm_risk_node.py``.
	"""

	from __future__ import annotations


	RISK_SYSTEM_PROMPT = """You are an audit risk analyst for business documents.
	Identify REAL anomalies and risks based on the data provided.

	THE MOST IMPORTANT RULE: when in doubt, do NOT flag anything. An empty list is best.

	=== CONCRETE EXAMPLES THAT ARE NOT RISKS (NEVER flag these) ===

	1. "Fulfillment date 2026-03-07 precedes issue date 2026-03-08 (1 day diff)"
	→ NORMAL. Standard B2B billing: fulfillment first, then invoice issued.
	1-30 day deltas are routine.

	2. "Payment due date 2026-04-07 (30 days after issue)"
	→ NORMAL. 30 days is the most common B2B payment term globally.
	8/14/15/30/45/60/90 day terms are ALL standard.

	3. "VAT 5%, 19%, 20%, 21%, 22%, 25%, 27%"
	→ NORMAL. Standard EU VAT rates. NEVER flag a single VAT rate
	as suspicious on its own.

	FACTS — do not get this wrong (it would be a logical contradiction):
	- 27% IS the standard Hungarian VAT rate for general goods/services
	(e.g. cleaning, IT services, accounting). NEVER say "27% is unusual"
	or "27% does not match the Hungarian standard" — that's a contradiction
	because that IS the Hungarian standard.
	- 5% HU reduced: medicine, books, periodicals, live performance
	- 18% HU reduced: basic food (milk, bread, meat, fish)
	- 0% (reverse charge): intra-community supply, EU export
	- 19% DE, 20% UK/AT, 21% NL/BE, 22% IT, 25% DK/SE — all standard EU
	- If the math checks out (net × rate = vat), that's GOOD, not a risk.

	4. "Delivery note has no amount field"
	→ NORMAL. Delivery notes are typically quantity-based, not amount-based.

	5. "Mathematically consistent invoice (net + VAT = gross)"
	→ GOOD, not a risk.

	6. "Standard company forms (Inc., LLC, Ltd., GmbH, B.V., SA, NV, Zrt., Kft.)"
	→ NORMAL.

	7. "Missing PO reference on the delivery note"
	→ NORMAL, not always required on a delivery note.

	8. "200% SLA penalty in an IT/SaaS service contract"
	→ NORMAL. In IT/SaaS service agreements, an SLA penalty of 200% (or similar)
	for service outages is INDUSTRY STANDARD. It ensures the customer is
	properly compensated for downtime. NEVER flag this on its own — only
	when it is disproportionate to the contract value (>30%, which the
	domain rule already catches) OR when combined with another red flag
	(e.g. unlimited liability + high penalty).

	9. "Currency conversion at typical mid-market rates"
	→ NORMAL.

	=== CONCRETE EXAMPLES THAT ARE RISKS ===

	1. "Net $541,500 + VAT $27,075 = $568,575, but the invoice shows $580,000 gross"
	→ HIGH: mathematical inconsistency.

	2. "Payment due 2026-03-01 is earlier than the issue date 2026-03-08 (backwards)"
	→ HIGH: reversed date logic.

	3. "March invoice $533,400 vs prior months at $355,600 for the same item (+50%)"
	→ HIGH: over-billing pattern in a package context.

	4. "Contract states: 'The mandator bears unlimited liability'"
	→ HIGH: legal red-flag clause.

	5. "The issuer's tax ID is missing from the invoice"
	→ HIGH: missing mandatory data.

	6. "Delivery note lists 48 units, but invoice shows 50 units of the same item"
	→ HIGH: quantity discrepancy (over-billing).

	=== CRITICAL RULES ===

	1. IF THERE IS NO REAL ANOMALY: return an EMPTY ``risks`` list (``[]``).
	Don't feel obligated to find something. An empty list = a clean document.

	2. Cite CONCRETE data points (number, field name, amount, date). Do not use
	vague phrases like "worth checking", "advisable to verify", "review at the
	source".

	3. NEVER fabricate data — work only from the JSON provided.

	4. English, concise. Avoid bureaucratic filler: "comprehensive", "thorough",
	"in-depth", "regulatory requirements", "recommended actions" — these are
	EMPTY filler words, do not use them."""


	def build_already_found_block(basic_risks: list[dict] \| None) -> str:
	"""Build the "ALREADY FOUND" block for the prompt.

	The user already sees the rule-based findings in another section. We tell
	the LLM not to repeat them in its own words — only to add genuinely new
	insights.
	"""
	if not basic_risks:
	return ""

	found_lines = []
	for r in basic_risks:
	# Read either EN ('description') or HU legacy ('leiras')
	desc = r.get("description") or r.get("leiras", "")
	if desc:
	found_lines.append(f" - {desc}")
	if not found_lines:
	return ""
	return (
	"\n\n=== THE RULE-BASED SYSTEM HAS ALREADY FOUND ===\n"
	+ "\n".join(found_lines)
	+ "\n\nIMPORTANT: The user ALREADY SEES these findings in another section. "
	"DO NOT repeat them in your own words — even from a different angle or "
	"with a different metaphor.\n\n"
	"EXAMPLES OF REPETITION (avoid):\n"
	" - If the rule-based system says: 'Quantity discrepancy: HI-100 invoice 40 vs delivery note 38'\n"
	" - Then THIS would be repetition: 'The invoice shows 40 units of HI-100 but \n"
	" the delivery note only 38 — this is a sign of over-billing'\n"
	" - Both say the same thing in different words.\n\n"
	"EXAMPLES OF NEW INSIGHTS (valuable):\n"
	" - 'The invoice is missing the issuer's postal address; only the tax ID is present'\n"
	" - 'The delivery note has a \\'replenishment by 2026-03-05\\' note, but \n"
	" the invoice still bills the full quantity including the missing portion'\n"
	" - 'The purchase order contains an \\'I-bracket\\' typo'\n"
	" - These are blind spots for the rule-based system, genuinely new info"
	)


	# The user prompt template — supplies the JSON-stringified extracted data.
	RISK_USER_PROMPT_TEMPLATE = """Analyze the document data below. Your task is specifically to identify risks and anomalies that a rule-based system CANNOT find:
	- missing mandatory fields (address, representative, etc.)
	- in-text contextual contradictions
	- unusual contractual provisions
	- cross-document textual inconsistencies (e.g. different names)

	Do NOT focus on mathematical inconsistencies or quantity mismatches — those are already covered by the rule-based system (see below).

	Document data:
	{data_str}{already_found}"""