# CivicAI — Real-World Problem Statement

## Problem Definition

> **AI-driven societal policy optimization under uncertainty**

Modern governments face a combinatorial decision-making problem: thousands of interdependent policy levers (taxes, healthcare spending, education, policing, subsidies, emergency responses) interact through complex causal chains to produce emergent societal outcomes across economic, public-health, and social cohesion dimensions — often with weeks-to-years of lag and high uncertainty.

No human decision-maker can simultaneously optimise all dimensions. AI agents trained in CivicAI learn to:

1. Observe rich societal state (12+ indicators)
2. Act across a continuous multi-dimensional policy space
3. Receive delayed, multi-objective feedback
4. Adapt to unexpected shocks (pandemics, market crashes, social unrest)

---

## Real-World Domain Mapping

| CivicAI dimension | Real-world counterpart | Real data anchor |
|---|---|---|
| `gdp`, `gdp_growth`, `inflation` | Macroeconomic fiscal policy | World Bank GDP / IMF inflation data |
| `employment_rate` | Labour market policy | ILO unemployment statistics |
| `tax_rate`, `budget_balance` | Government revenue & deficit | OECD fiscal balance data |
| `health_index`, `infection_rate` | Public-health capacity & epidemics | WHO health expenditure / GHI |
| `crime_rate` | Rule-of-law & public safety | UNODC crime indices |
| `public_satisfaction` | Democratic legitimacy / approval | Edelman Trust Barometer |
| `emergent.wealth_inequality` | Distributional equity | Gini coefficient (World Bank) |
| `emergent.social_unrest` | Political stability | World Governance Indicators |
| `food_reserves`, `energy_reserves` | Strategic resource security | FAO / IEA stockpile data |
| `education_quality` | Human capital investment | UNESCO / PISA |

### Domain 1 — Governance (Fiscal Policy)

**Real-world problem:** Governments must set tax rates that raise revenue without suppressing growth, and allocate budgets across competing public goods (healthcare vs. education vs. security) while maintaining fiscal sustainability.

**CivicAI mapping:**

- Action: `tax_rate` ∈ [0, 1], `healthcare_budget`, `education_budget`, `police_budget`
- State: `gdp`, `inflation`, `employment_rate`, `budget_balance`
- Challenge: High taxes → GDP drag; low taxes → deficit spiral

### Domain 2 — Economy (Macroeconomic Stabilisation)

**Real-world problem:** Recessions require countercyclical stimulus, but overspending triggers inflation. Optimal fiscal multipliers depend on the current economic regime.

**CivicAI mapping:**

- Action: `subsidy_policy` ∈ {none, agriculture, industry, technology}
- State: `gdp_growth`, `inflation`, `employment_rate`
- Challenge: Technology subsidies boost long-run growth but worsen near-term inequality; agriculture subsidies improve food security but reduce GDP growth

### Domain 3 — Public Health (Epidemic Management)

**Real-world problem:** Pandemics create tradeoffs between infection suppression (via lockdowns) and economic activity. Optimal policies depend on medical supply capacity, infection dynamics, and public compliance.

**CivicAI mapping:**

- Action: `healthcare_budget`, `emergency_response` (lockdown / stimulus / open)
- State: `infection_rate`, `health_index`, `medical_supplies`, `gdp`
- Challenge: Lockdown reduces infection but crushes GDP; premature opening causes epidemic rebound

### Domain 4 — Social Cohesion (Crisis Management)

**Real-world problem:** Compound crises (unemployment + crime + inequality + unrest) exhibit non-linear cascade dynamics: once social unrest exceeds a threshold, even good economic data fails to restore stability.


**CivicAI mapping:**

- Action: All levers simultaneously; no single dominant strategy
- State: `public_satisfaction`, `crime_rate`, `emergent.wealth_inequality`, `emergent.social_unrest`
- Challenge: Inequality is a slow-moving structural variable; quick fixes (police budget) address symptoms, not causes

---

## Tasks

### Task 1 — Economic Stability `[EASY]`

**Objective:** Restore a mild recession economy to fiscal stability.

| Criterion | Target | Failure |
|---|---|---|
| Inflation | < 6% | ≥ 15% |
| Employment | > 85% | ≤ 65% |
| GDP | > $400B | ≤ $250B |
| Budget Balance | Surplus preferred | ≤ −30% deficit |

**Initial conditions:** GDP $450B, inflation 7%, employment 82%, satisfaction 55%

**Deterministic grader** (`EconomicStabilityGrader`):

```
score = 0.40 × inflation_score
      + 0.40 × employment_score
      + 0.10 × gdp_score
      + 0.10 × budget_score

inflation_score  = linear_inv(inflation, ideal=3%, fail=15%)
                   × 0.40 if hyperinflation (>20%)
employment_score = linear(employment_rate, fail=65%, ideal=90%)
gdp_score        = linear(gdp, fail=$250B, ideal=$500B)
budget_score     = linear(budget_balance, fail=−30%, ideal=0%)

All linear() / linear_inv() produce values in [0.0, 1.0].
No random calls. Always deterministic.
```

**Success threshold:** score ≥ 0.75

---

### Task 2 — Pandemic Management `[MEDIUM]`

**Objective:** Suppress a 20% infection-rate epidemic without destroying the economy.


| Criterion | Target | Failure |
|---|---|---|
| Infection rate | < 10% | ≥ 30% |
| Health index | > 0.60 | ≤ 0.30 |
| GDP | > $300B | ≤ $200B |
| Medical supplies | > 0.60 | ≤ 0.20 |

**Initial conditions:** Infection 20%, health index 0.55, GDP $480B, medical supplies 0.50

**Deterministic grader** (`PandemicManagementGrader`):

```
score = 0.40 × infection_score
      + 0.30 × health_score
      + 0.20 × gdp_score
      + 0.10 × supplies_score

infection_score = linear_inv(infection_rate, ideal=2%, fail=30%)
                  × 0.50 if epidemic OOC (≥40%)
health_score    = linear(health_index, fail=0.30, ideal=0.80)
gdp_score       = linear(gdp, fail=$200B, ideal=$480B)
supplies_score  = linear(medical_supplies, fail=0.20, ideal=0.80)

No random calls. Always deterministic.
```

**Core tension:** Lockdown ↑ infection_score but ↓ gdp_score — agent must find the optimal tradeoff trajectory.

**Success threshold:** score ≥ 0.75

---

### Task 3 — Social Stability Crisis `[HARD]`

**Objective:** Restore social order from a compound multi-domain crisis with cascading failure risk.

| Criterion | Target | Failure |
|---|---|---|
| Public satisfaction | > 50% | ≤ 15% |
| Crime rate | < 12% | ≥ 35% |
| Employment rate | > 80% | ≤ 55% |
| Wealth inequality (Gini) | < 0.40 | ≥ 0.70 |

**Initial conditions:** Employment 68%, crime 25%, satisfaction 30%, Gini 0.55, social unrest 0.45

**Deterministic grader** (`SocialCrisisGrader`):

```
score = 0.30 × satisfaction_score
      + 0.25 × crime_score
      + 0.25 × employment_score
      + 0.20 × inequality_score
      × 0.60 if social_unrest > 0.65  (cascade penalty)

satisfaction_score = linear(public_satisfaction, fail=0.15, ideal=0.70)
crime_score        = linear_inv(crime_rate, ideal=5%, fail=35%)
                     × 0.50 if crime_rate ≥ 40%
employment_score   = linear(employment_rate, fail=55%, ideal=88%)
inequality_score   = linear_inv(gini, ideal=0.20, fail=0.70)

No random calls. Always deterministic.
```

**Why it's hard:**

- Gini is structural — requires sustained tax redistribution over many turns
- Social unrest cascade multiplier punishes instability even when individual metrics improve
- No single dominant strategy; agents must balance all four dimensions simultaneously

**Success threshold:** score ≥ 0.75

---

## Grader API

```python
from civicai.graders import grade, GradeResult

result: GradeResult = grade(state, task_id="stabilize_economy")
print(result.score)      # float ∈ [0.0, 1.0]
print(result.success)    # bool: True if score ≥ 0.75
print(result.summary)    # human-readable verdict
print(result.to_dict())  # full component breakdown (JSON-serializable)
```

Every `env.step()` call returns this grade in `info["task_grade"]`:

```python
obs, reward, done, info = env.step(action)
grade_result = info["task_grade"]  # dict: {score, success, components, ...}
```

---

## Why This Is Non-Trivial

| Challenge | Description |
|---|---|
| **Multi-objective** | 5 rubric dimensions + task-specific grader — no single scalar fully captures the objective |
| **Long-horizon** | 50-turn episodes; many actions have 5–10 turn lag before effects appear |
| **Non-linear dynamics** | Social unrest cascade, hyperinflation multiplier, epidemic OOC penalty |
| **Structural vs. tactical** | Gini responds slowly to redistribution; crime responds quickly to policing |
| **Real-world data** | GDP growth, inflation, unemployment, life expectancy anchored to World Bank baseline |
| **Emergent behaviour** | Wealth inequality → unrest → protest → GDP drag (3-step causal chain) |
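
To make the grader formulas in the Tasks section concrete, here is a minimal sketch of the Task 3 (`SocialCrisisGrader`) scoring rule in plain Python. It is not the CivicAI implementation: the helper bodies, the clamping to [0.0, 1.0] via `min`/`max`, the function name `social_crisis_score`, and the convention that rates are passed as fractions (25% → 0.25) are all assumptions made for illustration. Only the weights, thresholds, and multipliers come from the formula above.

```python
def linear(value: float, fail: float, ideal: float) -> float:
    """Higher is better: 0.0 at or below `fail`, 1.0 at or above `ideal` (assumed clamping)."""
    return min(1.0, max(0.0, (value - fail) / (ideal - fail)))


def linear_inv(value: float, ideal: float, fail: float) -> float:
    """Lower is better: 1.0 at or below `ideal`, 0.0 at or above `fail` (assumed clamping)."""
    return min(1.0, max(0.0, (fail - value) / (fail - ideal)))


def social_crisis_score(public_satisfaction: float, crime_rate: float,
                        employment_rate: float, gini: float,
                        social_unrest: float) -> float:
    """Hypothetical re-implementation of the Task 3 formula; rates are fractions."""
    satisfaction_score = linear(public_satisfaction, fail=0.15, ideal=0.70)

    crime_score = linear_inv(crime_rate, ideal=0.05, fail=0.35)
    if crime_rate >= 0.40:
        # Multiplier from the formula; with the clamping assumed here the
        # base score is already 0.0 at this point, so it has no further effect.
        crime_score *= 0.50

    employment_score = linear(employment_rate, fail=0.55, ideal=0.88)
    inequality_score = linear_inv(gini, ideal=0.20, fail=0.70)

    score = (0.30 * satisfaction_score
             + 0.25 * crime_score
             + 0.25 * employment_score
             + 0.20 * inequality_score)

    if social_unrest > 0.65:
        score *= 0.60  # cascade penalty applied to the whole weighted sum
    return score


# Task 3 initial conditions: satisfaction 30%, crime 25%, employment 68%,
# Gini 0.55, social unrest 0.45.
initial = social_crisis_score(0.30, 0.25, 0.68, 0.55, 0.45)
print(f"{initial:.3f}")  # prints 0.324
```

Under these assumptions the Task 3 starting state scores roughly 0.32, well short of the 0.75 success threshold, and the same state with `social_unrest` above 0.65 would drop to 60% of that value.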