	# CivicAI — Real-World Problem Statement

	## Problem Definition

	> AI-driven societal policy optimization under uncertainty

	Modern governments face a combinatorial decision-making problem: thousands of
	interdependent policy levers (taxes, healthcare spending, education, policing,
	subsidies, emergency responses) interact through complex causal chains to
	produce emergent societal outcomes across economic, public-health, and social
	cohesion dimensions — often with weeks-to-years of lag and high uncertainty.

	No human decision-maker can simultaneously optimise all dimensions. AI agents
	trained in CivicAI learn to:

	1. Observe a rich societal state (12+ indicators)
	2. Act across a continuous multi-dimensional policy space
	3. Receive delayed, multi-objective feedback
	4. Adapt to unexpected shocks (pandemics, market crashes, social unrest)

	---

	## Real-World Domain Mapping

	| CivicAI dimension | Real-world counterpart | Real data anchor |
	|---|---|---|
	| `gdp`, `gdp_growth`, `inflation` | Macroeconomic fiscal policy | World Bank GDP / IMF inflation data |
	| `employment_rate` | Labour market policy | ILO unemployment statistics |
	| `tax_rate`, `budget_balance` | Government revenue & deficit | OECD fiscal balance data |
	| `health_index`, `infection_rate` | Public-health capacity & epidemics | WHO health expenditure / GHI |
	| `crime_rate` | Rule-of-law & public safety | UNODC crime indices |
	| `public_satisfaction` | Democratic legitimacy / approval | Edelman Trust Barometer |
	| `emergent.wealth_inequality` | Distributional equity | Gini coefficient (World Bank) |
	| `emergent.social_unrest` | Political stability | World Governance Indicators |
	| `food_reserves`, `energy_reserves` | Strategic resource security | FAO / IEA stockpile data |
	| `education_quality` | Human capital investment | UNESCO / PISA |

	### Domain 1 — Governance (Fiscal Policy)

	Real-world problem: Governments must set tax rates that raise revenue
	without suppressing growth, and allocate budgets across competing public goods
	(healthcare vs. education vs. security) while maintaining fiscal sustainability.

	CivicAI mapping:
	- Action: `tax_rate` ∈ [0, 1], `healthcare_budget`, `education_budget`, `police_budget`
	- State: `gdp`, `inflation`, `employment_rate`, `budget_balance`
	- Challenge: High taxes → GDP drag; low taxes → deficit spiral

	### Domain 2 — Economy (Macroeconomic Stabilisation)

	Real-world problem: Recessions require countercyclical stimulus, but
	overspending triggers inflation. Optimal fiscal multipliers depend on the
	current economic regime.

	CivicAI mapping:
	- Action: `subsidy_policy` ∈ {none, agriculture, industry, technology}
	- State: `gdp_growth`, `inflation`, `employment_rate`
	- Challenge: Technology subsidies boost long-run growth but worsen near-term
	inequality; agriculture subsidies improve food security but reduce GDP growth

	### Domain 3 — Public Health (Epidemic Management)

	Real-world problem: Pandemics create tradeoffs between infection
	suppression (via lockdowns) and economic activity. Optimal policies depend on
	medical supply capacity, infection dynamics, and public compliance.

	CivicAI mapping:
	- Action: `healthcare_budget`, `emergency_response` (lockdown / stimulus / open)
	- State: `infection_rate`, `health_index`, `medical_supplies`, `gdp`
	- Challenge: Lockdown reduces infection but crushes GDP; premature opening
	causes epidemic rebound

	### Domain 4 — Social Cohesion (Crisis Management)

	Real-world problem: Compound crises (unemployment + crime + inequality +
	unrest) exhibit non-linear cascade dynamics: once social unrest exceeds a
	threshold, even good economic data fails to restore stability.

	CivicAI mapping:
	- Action: All levers simultaneously; no single dominant strategy
	- State: `public_satisfaction`, `crime_rate`, `emergent.wealth_inequality`,
	`emergent.social_unrest`
	- Challenge: Inequality is a slow-moving structural variable; quick fixes
	(police budget) address symptoms, not causes
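The levers listed across the four domains can be gathered into one typed action object. The following is a minimal sketch, not CivicAI's actual action class: the names `PolicyAction`, `SubsidyPolicy`, and `EmergencyResponse` are hypothetical, and because the source gives a range only for `tax_rate`, the budget fields are merely assumed to be non-negative.

```python
from dataclasses import dataclass
from enum import Enum


class SubsidyPolicy(Enum):
    """Subsidy options listed under Domain 2."""
    NONE = "none"
    AGRICULTURE = "agriculture"
    INDUSTRY = "industry"
    TECHNOLOGY = "technology"


class EmergencyResponse(Enum):
    """Emergency responses listed under Domain 3."""
    LOCKDOWN = "lockdown"
    STIMULUS = "stimulus"
    OPEN = "open"


@dataclass(frozen=True)
class PolicyAction:
    """Hypothetical container for one turn's policy levers."""
    tax_rate: float
    healthcare_budget: float
    education_budget: float
    police_budget: float
    subsidy_policy: SubsidyPolicy = SubsidyPolicy.NONE
    emergency_response: EmergencyResponse = EmergencyResponse.OPEN

    def __post_init__(self) -> None:
        # tax_rate ∈ [0, 1] is stated in the Domain 1 mapping.
        if not 0.0 <= self.tax_rate <= 1.0:
            raise ValueError(f"tax_rate must be in [0, 1], got {self.tax_rate}")
        # Assumption: budgets are non-negative; their units are not specified.
        for name in ("healthcare_budget", "education_budget", "police_budget"):
            if getattr(self, name) < 0.0:
                raise ValueError(f"{name} must be non-negative")


action = PolicyAction(
    tax_rate=0.30,
    healthcare_budget=0.25,
    education_budget=0.20,
    police_budget=0.10,
    subsidy_policy=SubsidyPolicy.TECHNOLOGY,
)
print(action)
```

Validating at construction time means an out-of-range lever is caught before it reaches the environment, which is convenient when actions come from a learned policy.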

	---

	## Tasks

	### Task 1 — Economic Stability `[EASY]`

	Objective: Restore an economy in mild recession to fiscal stability.

	| Criterion | Target | Failure |
	|---|---|---|
	| Inflation | < 6% | ≥ 15% |
	| Employment | > 85% | ≤ 65% |
	| GDP | > $400B | ≤ $250B |
	| Budget Balance | Surplus preferred | ≤ −30% deficit |

	Initial conditions: GDP $450B, inflation 7%, employment 82%, satisfaction 55%

	Deterministic grader (`EconomicStabilityGrader`):

	```
	score = 0.40 × inflation_score
	      + 0.40 × employment_score
	      + 0.10 × gdp_score
	      + 0.10 × budget_score

	inflation_score  = linear_inv(inflation, ideal=3%, fail=15%)
	                   × 0.40 if hyperinflation (inflation > 20%)
	employment_score = linear(employment_rate, fail=65%, ideal=90%)
	gdp_score        = linear(gdp, fail=$250B, ideal=$500B)
	budget_score     = linear(budget_balance, fail=−30%, ideal=0%)

	All linear() / linear_inv() produce values in [0.0, 1.0].
	No random calls. Always deterministic.
	```

	Success threshold: score ≥ 0.75
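To make the formula concrete, here is a rough Python sketch that scores the Task 1 initial conditions. It is not the `EconomicStabilityGrader` source. It assumes `linear()` and `linear_inv()` interpolate linearly between `fail` and `ideal` and clamp to [0, 1], that rates are passed as fractions and GDP in billions of dollars, and that the initial budget balance is 0% (the source does not state it). The hyperinflation multiplier is applied to `inflation_score` as the layout above suggests; under the clamping assumption that component is already 0 above 15%, so the real grader may apply the penalty differently.

```python
def linear(value: float, fail: float, ideal: float) -> float:
    """Higher is better: 0.0 at `fail`, 1.0 at `ideal`, clamped to [0, 1]."""
    return min(1.0, max(0.0, (value - fail) / (ideal - fail)))


def linear_inv(value: float, ideal: float, fail: float) -> float:
    """Lower is better: 1.0 at `ideal`, 0.0 at `fail`, clamped to [0, 1]."""
    return min(1.0, max(0.0, (fail - value) / (fail - ideal)))


def economic_stability_score(inflation: float, employment_rate: float,
                             gdp: float, budget_balance: float) -> float:
    """Weighted Task 1 score; rates are fractions, GDP is in $B."""
    inflation_score = linear_inv(inflation, ideal=0.03, fail=0.15)
    if inflation > 0.20:  # hyperinflation multiplier (placement is an assumption)
        inflation_score *= 0.40
    employment_score = linear(employment_rate, fail=0.65, ideal=0.90)
    gdp_score = linear(gdp, fail=250.0, ideal=500.0)
    budget_score = linear(budget_balance, fail=-0.30, ideal=0.0)
    return (0.40 * inflation_score + 0.40 * employment_score
            + 0.10 * gdp_score + 0.10 * budget_score)


# Initial conditions from Task 1; budget_balance=0.0 is an assumed value.
start = economic_stability_score(inflation=0.07, employment_rate=0.82,
                                 gdp=450.0, budget_balance=0.0)
print(round(start, 3))  # 0.719
```

Under these assumptions the episode starts just below the 0.75 threshold, so modest gains in inflation and employment, which together carry 80% of the weight, are enough to succeed.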

	---

	### Task 2 — Pandemic Management `[MEDIUM]`

	Objective: Suppress an epidemic that starts at a 20% infection rate without
	destroying the economy.

	| Criterion | Target | Failure |
	|---|---|---|
	| Infection rate | < 10% | ≥ 30% |
	| Health index | > 0.60 | ≤ 0.30 |
	| GDP | > $300B | ≤ $200B |
	| Medical supplies | > 0.60 | ≤ 0.20 |

	Initial conditions: Infection 20%, health index 0.55, GDP $480B, medical supplies 0.50

	Deterministic grader (`PandemicManagementGrader`):

	```
	score = 0.40 × infection_score
	      + 0.30 × health_score
	      + 0.20 × gdp_score
	      + 0.10 × supplies_score

	infection_score = linear_inv(infection_rate, ideal=2%, fail=30%)
	                  × 0.50 if epidemic out of control (infection_rate ≥ 40%)
	health_score    = linear(health_index, fail=0.30, ideal=0.80)
	gdp_score       = linear(gdp, fail=$200B, ideal=$480B)
	supplies_score  = linear(medical_supplies, fail=0.20, ideal=0.80)

	No random calls. Always deterministic.
	```

	Core tension: Lockdown raises `infection_score` but lowers `gdp_score`, so the
	agent must find a trajectory that trades one off against the other.

	Success threshold: score ≥ 0.75
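The tradeoff can be illustrated by scoring a few end states with the Task 2 weights. This is a sketch under stated assumptions: the helpers are assumed to be clamped linear interpolations, and the three end states are invented for illustration only; they are not outputs of the CivicAI simulator. The out-of-control multiplier is omitted because none of the example states reaches a 40% infection rate.

```python
def linear(value: float, fail: float, ideal: float) -> float:
    """Higher is better: 0.0 at `fail`, 1.0 at `ideal`, clamped to [0, 1]."""
    return min(1.0, max(0.0, (value - fail) / (ideal - fail)))


def linear_inv(value: float, ideal: float, fail: float) -> float:
    """Lower is better: 1.0 at `ideal`, 0.0 at `fail`, clamped to [0, 1]."""
    return min(1.0, max(0.0, (fail - value) / (fail - ideal)))


def pandemic_score(infection_rate: float, health_index: float,
                   gdp: float, medical_supplies: float) -> float:
    """Weighted Task 2 score; infection_rate is a fraction, GDP is in $B."""
    return (0.40 * linear_inv(infection_rate, ideal=0.02, fail=0.30)
            + 0.30 * linear(health_index, fail=0.30, ideal=0.80)
            + 0.20 * linear(gdp, fail=200.0, ideal=480.0)
            + 0.10 * linear(medical_supplies, fail=0.20, ideal=0.80))


# Task 2 initial conditions, taken from the text above.
initial = pandemic_score(0.20, 0.55, 480.0, 0.50)

# Hypothetical end states, invented for illustration only.
end_states = {
    "stay open":        pandemic_score(0.25, 0.50, 480.0, 0.40),
    "strict lockdown":  pandemic_score(0.04, 0.65, 300.0, 0.60),
    "balanced":         pandemic_score(0.06, 0.70, 400.0, 0.65),
}

print(f"initial: {initial:.3f}")
for name, score in end_states.items():
    print(f"{name}: {score:.3f} (success={score >= 0.75})")
```

In this illustration, protecting GDP alone and suppressing infection alone both fall short of 0.75; only the state that gives up some GDP while keeping infection low clears the threshold.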

	---

	### Task 3 — Social Stability Crisis `[HARD]`

	Objective: Restore social order from a compound multi-domain crisis with
	cascading failure risk.

	| Criterion | Target | Failure |
	|---|---|---|
	| Public satisfaction | > 50% | ≤ 15% |
	| Crime rate | < 12% | ≥ 35% |
	| Employment rate | > 80% | ≤ 55% |
	| Wealth inequality (Gini) | < 0.40 | ≥ 0.70 |

	Initial conditions: Employment 68%, crime 25%, satisfaction 30%, Gini 0.55, social unrest 0.45

	Deterministic grader (`SocialCrisisGrader`):

	```
	score = ( 0.30 × satisfaction_score
	        + 0.25 × crime_score
	        + 0.25 × employment_score
	        + 0.20 × inequality_score )
	        × 0.60 if social_unrest > 0.65 (cascade penalty)

	satisfaction_score = linear(public_satisfaction, fail=15%, ideal=70%)
	crime_score        = linear_inv(crime_rate, ideal=5%, fail=35%)
	                     × 0.50 if crime_rate ≥ 40%
	employment_score   = linear(employment_rate, fail=55%, ideal=88%)
	inequality_score   = linear_inv(gini, ideal=0.20, fail=0.70)

	No random calls. Always deterministic.
	```

	Why it's hard:
	- Gini is structural — requires sustained tax redistribution over many turns
	- Social unrest cascade multiplier punishes instability even when individual
	metrics improve
	- No single dominant strategy; agents must balance all four dimensions
	simultaneously

	Success threshold: score ≥ 0.75
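One consequence of the cascade penalty is worth spelling out. If the 0.60 multiplier applies to the whole weighted sum, which is an assumption based on where it appears in the formula, then no state with `social_unrest > 0.65` can reach the 0.75 threshold, because the score is capped at 0.60. The sketch below checks this with clamped linear helpers (also an assumption) and omits the separate crime multiplier for brevity.

```python
def linear(value: float, fail: float, ideal: float) -> float:
    """Higher is better: 0.0 at `fail`, 1.0 at `ideal`, clamped to [0, 1]."""
    return min(1.0, max(0.0, (value - fail) / (ideal - fail)))


def linear_inv(value: float, ideal: float, fail: float) -> float:
    """Lower is better: 1.0 at `ideal`, 0.0 at `fail`, clamped to [0, 1]."""
    return min(1.0, max(0.0, (fail - value) / (fail - ideal)))


def social_crisis_score(public_satisfaction: float, crime_rate: float,
                        employment_rate: float, gini: float,
                        social_unrest: float) -> float:
    """Weighted Task 3 score; all inputs are fractions in [0, 1]."""
    total = (0.30 * linear(public_satisfaction, fail=0.15, ideal=0.70)
             + 0.25 * linear_inv(crime_rate, ideal=0.05, fail=0.35)
             + 0.25 * linear(employment_rate, fail=0.55, ideal=0.88)
             + 0.20 * linear_inv(gini, ideal=0.20, fail=0.70))
    if social_unrest > 0.65:  # cascade penalty, assumed to scale the total
        total *= 0.60
    return total


# Task 3 initial conditions from the text above.
initial = social_crisis_score(0.30, 0.25, 0.68, 0.55, social_unrest=0.45)

# Every metric at its ideal value, but unrest above the cascade threshold.
capped = social_crisis_score(0.70, 0.05, 0.88, 0.20, social_unrest=0.70)

print(f"initial: {initial:.3f}")
print(f"ideal metrics, high unrest: {capped:.2f}")
```

Under this reading, bringing unrest to 0.65 or below is a precondition for success rather than one objective among several.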

	---

	## Grader API

	```python
	from civicai.graders import grade, GradeResult

	# `state` is the environment state to be graded.
	result: GradeResult = grade(state, task_id="stabilize_economy")

	print(result.score)      # float ∈ [0.0, 1.0]
	print(result.success)    # bool: True if score ≥ 0.75
	print(result.summary)    # human-readable verdict
	print(result.to_dict())  # full component breakdown (JSON-serializable)
	```

	Every `env.step()` call also returns this grade, as a dict, in `info["task_grade"]`:

	```python
	obs, reward, done, info = env.step(action)
	grade_result = info["task_grade"]  # dict: {score, success, components, ...}
	```
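Putting the pieces together, an episode loop can track the per-step grade. The sketch below is illustrative: `run_episode` and `StubEnv` are hypothetical names, `StubEnv` is a stand-in so the example runs without CivicAI installed, and the `reset()` method is an assumption, since the source only shows `env.step()`. The 50-turn default matches the episode length mentioned in this document.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_episode(env: Any, policy: Callable[[Any], Any],
                max_turns: int = 50) -> List[float]:
    """Run one episode and return the task score reported at each step."""
    obs = env.reset()  # assumption: the environment exposes reset()
    scores: List[float] = []
    for _ in range(max_turns):
        obs, reward, done, info = env.step(policy(obs))
        scores.append(info["task_grade"]["score"])
        if done:
            break
    return scores


class StubEnv:
    """Stand-in environment, NOT CivicAI: the score rises by 0.1 per turn."""

    def __init__(self) -> None:
        self.turn = 0

    def reset(self) -> Dict[str, int]:
        self.turn = 0
        return {"turn": self.turn}

    def step(self, action: Any) -> Tuple[Dict[str, int], float, bool, Dict[str, Any]]:
        self.turn += 1
        score = min(1.0, 0.1 * self.turn)
        grade = {"score": score, "success": score >= 0.75}
        done = self.turn >= 10
        return {"turn": self.turn}, score, done, {"task_grade": grade}


# A trivial policy that ignores the observation; a real agent would not.
scores = run_episode(StubEnv(), policy=lambda obs: None)
print(len(scores), scores[-1])
```

With a real environment, the same loop would take the CivicAI `env` object and a policy that maps observations to policy actions.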

	---

	## Why This Is Non-Trivial

	| Challenge | Description |
	|---|---|
	| Multi-objective | 5 rubric dimensions + task-specific grader; no single scalar fully captures the objective |
	| Long-horizon | 50-turn episodes; many actions have 5–10 turn lag before effects appear |
	| Non-linear dynamics | Social unrest cascade, hyperinflation multiplier, epidemic out-of-control penalty |
	| Structural vs. tactical | Gini responds slowly to redistribution; crime responds quickly to policing |
	| Real-world data | GDP growth, inflation, unemployment, life expectancy anchored to World Bank baseline |
	| Emergent behaviour | Wealth inequality → unrest → protest → GDP drag (3-step causal chain) |