# CivicAI — Real-World Problem Statement

## Problem Definition

> **AI-driven societal policy optimization under uncertainty**

Modern governments face a combinatorial decision-making problem: thousands of
interdependent policy levers (taxes, healthcare spending, education, policing,
subsidies, emergency responses) interact through complex causal chains to
produce emergent societal outcomes across economic, public-health, and social
cohesion dimensions — often with weeks-to-years of lag and high uncertainty.

No human decision-maker can simultaneously optimise all dimensions. AI agents
trained in CivicAI learn to:

1. Observe rich societal state (12+ indicators)
2. Act across a continuous multi-dimensional policy space
3. Receive delayed, multi-objective feedback
4. Adapt to unexpected shocks (pandemics, market crashes, social unrest)

---

## Real-World Domain Mapping

| CivicAI dimension | Real-world counterpart | Real data anchor |
|---|---|---|
| `gdp`, `gdp_growth`, `inflation` | Macroeconomic fiscal policy | World Bank GDP / IMF inflation data |
| `employment_rate` | Labour market policy | ILO unemployment statistics |
| `tax_rate`, `budget_balance` | Government revenue & deficit | OECD fiscal balance data |
| `health_index`, `infection_rate` | Public-health capacity & epidemics | WHO health expenditure / GHI |
| `crime_rate` | Rule-of-law & public safety | UNODC crime indices |
| `public_satisfaction` | Democratic legitimacy / approval | Edelman Trust Barometer |
| `emergent.wealth_inequality` | Distributional equity | Gini coefficient (World Bank) |
| `emergent.social_unrest` | Political stability | World Governance Indicators |
| `food_reserves`, `energy_reserves` | Strategic resource security | FAO / IEA stockpile data |
| `education_quality` | Human capital investment | UNESCO / PISA |

### Domain 1 — Governance (Fiscal Policy)

**Real-world problem:** Governments must set tax rates that raise revenue
without suppressing growth, and allocate budgets across competing public goods
(healthcare vs. education vs. security) while maintaining fiscal sustainability.

**CivicAI mapping:**
- Action: `tax_rate` ∈ [0, 1], `healthcare_budget`, `education_budget`, `police_budget`
- State: `gdp`, `inflation`, `employment_rate`, `budget_balance`
- Challenge: High taxes → GDP drag; low taxes → deficit spiral

### Domain 2 — Economy (Macroeconomic Stabilisation)

**Real-world problem:** Recessions require countercyclical stimulus, but
overspending triggers inflation. Optimal fiscal multipliers depend on the
current economic regime.

**CivicAI mapping:**
- Action: `subsidy_policy` ∈ {none, agriculture, industry, technology}
- State: `gdp_growth`, `inflation`, `employment_rate`
- Challenge: Technology subsidies boost long-run growth but worsen near-term
  inequality; agriculture subsidies improve food security but reduce GDP growth

### Domain 3 — Public Health (Epidemic Management)

**Real-world problem:** Pandemics create tradeoffs between infection
suppression (via lockdowns) and economic activity. Optimal policies depend on
medical supply capacity, infection dynamics, and public compliance.

**CivicAI mapping:**
- Action: `healthcare_budget`, `emergency_response` (lockdown / stimulus / open)
- State: `infection_rate`, `health_index`, `medical_supplies`, `gdp`
- Challenge: Lockdown reduces infection but crushes GDP; premature opening
  causes epidemic rebound

### Domain 4 — Social Cohesion (Crisis Management)

**Real-world problem:** Compound crises (unemployment + crime + inequality +
unrest) exhibit non-linear cascade dynamics: once social unrest exceeds a
threshold, even good economic data fails to restore stability.

**CivicAI mapping:**
- Action: All levers simultaneously; no single dominant strategy
- State: `public_satisfaction`, `crime_rate`, `emergent.wealth_inequality`,
  `emergent.social_unrest`
- Challenge: Inequality is a slow-moving structural variable; quick fixes
  (police budget) address symptoms, not causes
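
The levers named across the four domains can be gathered into one action record. The following is a minimal sketch, not CivicAI's actual action class: the names `PolicyAction`, `Subsidy`, and `Emergency`, the default values, and the non-negativity check on budgets are all assumptions made for illustration. Only the field names, the `tax_rate` range, and the enumerated options come from the mappings above.

```python
from dataclasses import dataclass
from enum import Enum


class Subsidy(Enum):
    # Options listed under Domain 2.
    NONE = "none"
    AGRICULTURE = "agriculture"
    INDUSTRY = "industry"
    TECHNOLOGY = "technology"


class Emergency(Enum):
    # Options listed under Domain 3.
    LOCKDOWN = "lockdown"
    STIMULUS = "stimulus"
    OPEN = "open"


@dataclass(frozen=True)
class PolicyAction:
    """Hypothetical container for one turn's policy levers."""
    tax_rate: float            # documented range: [0, 1]
    healthcare_budget: float   # range not documented; assumed non-negative
    education_budget: float
    police_budget: float
    subsidy_policy: Subsidy = Subsidy.NONE            # default is an assumption
    emergency_response: Emergency = Emergency.OPEN    # default is an assumption

    def __post_init__(self):
        if not 0.0 <= self.tax_rate <= 1.0:
            raise ValueError(f"tax_rate must be in [0, 1], got {self.tax_rate}")
        for name in ("healthcare_budget", "education_budget", "police_budget"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")


action = PolicyAction(tax_rate=0.3, healthcare_budget=0.2,
                      education_budget=0.2, police_budget=0.1)
```

Validating at construction time keeps out-of-range levers from reaching the environment, which matters when a learned policy emits unbounded continuous outputs.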

---

## Tasks

### Task 1 — Economic Stability `[EASY]`

**Objective:** Restore an economy in mild recession to fiscal stability.

| Criterion | Target | Failure |
|---|---|---|
| Inflation | < 6% | ≥ 15% |
| Employment | > 85% | ≤ 65% |
| GDP | > $400B | ≤ $250B |
| Budget balance | Surplus preferred | ≤ −30% (deficit) |

**Initial conditions:** GDP $450B, inflation 7%, employment 82%, satisfaction 55%

**Deterministic grader** (`EconomicStabilityGrader`):

```
score = 0.40 × inflation_score
      + 0.40 × employment_score
      + 0.10 × gdp_score
      + 0.10 × budget_score

inflation_score  = linear_inv(inflation, ideal=3%, fail=15%)
                   × 0.40 if hyperinflation (>20%)
employment_score = linear(employment_rate, fail=65%, ideal=90%)
gdp_score        = linear(gdp, fail=$250B, ideal=$500B)
budget_score     = linear(budget_balance, fail=−30%, ideal=0%)

All linear() / linear_inv() produce values in [0.0, 1.0].
No random calls. Always deterministic.
```
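
The formula above can be sketched in Python as follows. This is an illustration under stated assumptions, not the source of `EconomicStabilityGrader`: `linear` and `linear_inv` are assumed to be clamped linear interpolations between `fail` and `ideal`, the function name `economic_stability_score` is hypothetical, and a budget balance of 0% is assumed because the initial conditions do not give one.

```python
def linear(x, fail, ideal):
    """Higher is better: 0 at or below `fail`, 1 at or above `ideal`."""
    return min(1.0, max(0.0, (x - fail) / (ideal - fail)))


def linear_inv(x, ideal, fail):
    """Lower is better: 1 at or below `ideal`, 0 at or above `fail`."""
    return min(1.0, max(0.0, (fail - x) / (fail - ideal)))


def economic_stability_score(inflation, employment_rate, gdp, budget_balance):
    """Percent inputs for inflation, employment, and budget balance; gdp in $B."""
    inflation_score = linear_inv(inflation, ideal=3.0, fail=15.0)
    if inflation > 20.0:
        # Hyperinflation multiplier, applied to the component as written.
        inflation_score *= 0.40
    employment_score = linear(employment_rate, fail=65.0, ideal=90.0)
    gdp_score = linear(gdp, fail=250.0, ideal=500.0)
    budget_score = linear(budget_balance, fail=-30.0, ideal=0.0)
    return (0.40 * inflation_score + 0.40 * employment_score
            + 0.10 * gdp_score + 0.10 * budget_score)


# Task 1 initial conditions; budget_balance=0.0 is an assumption.
start = economic_stability_score(inflation=7.0, employment_rate=82.0,
                                 gdp=450.0, budget_balance=0.0)
print(round(start, 3))  # 0.719
```

Under these assumptions the starting state scores about 0.719. Read literally, the ×0.40 branch cannot change the result, because `linear_inv` already returns 0 for any inflation above 15%. If the real grader applies the multiplier elsewhere (for example to the total score), the sketch would need adjusting.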

**Success threshold:** score ≥ 0.75

---

### Task 2 — Pandemic Management `[MEDIUM]`

**Objective:** Suppress an epidemic that starts at a 20% infection rate without
destroying the economy.

| Criterion | Target | Failure |
|---|---|---|
| Infection rate | < 10% | ≥ 30% |
| Health index | > 0.60 | ≤ 0.30 |
| GDP | > $300B | ≤ $200B |
| Medical supplies | > 0.60 | ≤ 0.20 |

**Initial conditions:** Infection 20%, health index 0.55, GDP $480B, medical supplies 0.50

**Deterministic grader** (`PandemicManagementGrader`):

```
score = 0.40 × infection_score
      + 0.30 × health_score
      + 0.20 × gdp_score
      + 0.10 × supplies_score

infection_score = linear_inv(infection_rate, ideal=2%, fail=30%)
                  × 0.50 if epidemic is out of control (infection_rate ≥ 40%)
health_score    = linear(health_index, fail=0.30, ideal=0.80)
gdp_score       = linear(gdp, fail=$200B, ideal=$480B)
supplies_score  = linear(medical_supplies, fail=0.20, ideal=0.80)

No random calls. Always deterministic.
```
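
A minimal Python sketch of this weighting, scored on the task's initial conditions. As before, the clamped-interpolation behaviour of `linear` / `linear_inv` and the function name `pandemic_score` are assumptions, not the source of `PandemicManagementGrader`.

```python
def clamp01(v):
    """Restrict a value to the interval [0, 1]."""
    return min(1.0, max(0.0, v))


def linear(x, fail, ideal):
    # Higher is better.
    return clamp01((x - fail) / (ideal - fail))


def linear_inv(x, ideal, fail):
    # Lower is better.
    return clamp01((fail - x) / (fail - ideal))


def pandemic_score(infection_rate, health_index, gdp, medical_supplies):
    """infection_rate in percent, gdp in $B, the other two on a 0-1 scale."""
    infection_score = linear_inv(infection_rate, ideal=2.0, fail=30.0)
    if infection_rate >= 40.0:
        # Out-of-control multiplier, applied to the component as written.
        infection_score *= 0.50
    health_score = linear(health_index, fail=0.30, ideal=0.80)
    gdp_score = linear(gdp, fail=200.0, ideal=480.0)
    supplies_score = linear(medical_supplies, fail=0.20, ideal=0.80)
    return (0.40 * infection_score + 0.30 * health_score
            + 0.20 * gdp_score + 0.10 * supplies_score)


start = pandemic_score(infection_rate=20.0, health_index=0.55,
                       gdp=480.0, medical_supplies=0.50)
print(round(start, 3))  # 0.543
```

With these formulas the starting state scores about 0.543. Holding health, GDP, and supplies at their initial values, the infection rate would have to fall to roughly 5.5% for the total to reach 0.75, which gives a sense of how much suppression the task demands.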

**Core tension:** Lockdown raises `infection_score` but lowers `gdp_score`, so
the agent must find a trajectory that trades one off against the other.

**Success threshold:** score ≥ 0.75

---

### Task 3 — Social Stability Crisis `[HARD]`

**Objective:** Restore social order from a compound multi-domain crisis with
cascading failure risk.

| Criterion | Target | Failure |
|---|---|---|
| Public satisfaction | > 50% | ≤ 15% |
| Crime rate | < 12% | ≥ 35% |
| Employment rate | > 80% | ≤ 55% |
| Wealth inequality (Gini) | < 0.40 | ≥ 0.70 |

**Initial conditions:** Employment 68%, crime 25%, satisfaction 30%, Gini 0.55, social unrest 0.45

**Deterministic grader** (`SocialCrisisGrader`):

```
base  = 0.30 × satisfaction_score
      + 0.25 × crime_score
      + 0.25 × employment_score
      + 0.20 × inequality_score
score = base × 0.60 if social_unrest > 0.65 (cascade penalty), else base

satisfaction_score  = linear(public_satisfaction, fail=0.15, ideal=0.70)
crime_score         = linear_inv(crime_rate, ideal=5%, fail=35%)
                      × 0.50 if crime_rate ≥ 40%
employment_score    = linear(employment_rate, fail=55%, ideal=88%)
inequality_score    = linear_inv(gini, ideal=0.20, fail=0.70)

No random calls. Always deterministic.
```
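
The sketch below scores the Task 3 initial conditions and shows the effect of the cascade penalty. It assumes the ×0.60 penalty multiplies the whole weighted sum (consistent with the "cascade multiplier" description, but not confirmed by the formula's layout), that `linear` / `linear_inv` are clamped interpolations, and that `social_crisis_score` is a hypothetical name rather than the `SocialCrisisGrader` API.

```python
def linear(x, fail, ideal):
    """Higher is better, clamped to [0, 1]."""
    return min(1.0, max(0.0, (x - fail) / (ideal - fail)))


def linear_inv(x, ideal, fail):
    """Lower is better, clamped to [0, 1]."""
    return min(1.0, max(0.0, (fail - x) / (fail - ideal)))


def social_crisis_score(satisfaction, crime_rate, employment_rate, gini, social_unrest):
    """satisfaction, gini, unrest on a 0-1 scale; crime and employment in percent."""
    satisfaction_score = linear(satisfaction, fail=0.15, ideal=0.70)
    crime_score = linear_inv(crime_rate, ideal=5.0, fail=35.0)
    if crime_rate >= 40.0:
        crime_score *= 0.50   # applied to the component as written
    employment_score = linear(employment_rate, fail=55.0, ideal=88.0)
    inequality_score = linear_inv(gini, ideal=0.20, fail=0.70)
    base = (0.30 * satisfaction_score + 0.25 * crime_score
            + 0.25 * employment_score + 0.20 * inequality_score)
    # Assumption: the cascade penalty scales the total, not one term.
    return base * 0.60 if social_unrest > 0.65 else base


calm = social_crisis_score(0.30, 25.0, 68.0, 0.55, social_unrest=0.45)
cascade = social_crisis_score(0.30, 25.0, 68.0, 0.55, social_unrest=0.70)
print(round(calm, 3), round(cascade, 3))  # 0.324 0.194
```

Under these assumptions the starting state scores about 0.324, far below the other two tasks' starting scores, and crossing the unrest threshold with otherwise identical metrics drops it to about 0.194.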

**Why it's hard:**
- Gini is structural — requires sustained tax redistribution over many turns
- Social unrest cascade multiplier punishes instability even when individual
  metrics improve
- No single dominant strategy; agents must balance all four dimensions
  simultaneously

**Success threshold:** score ≥ 0.75

---

## Grader API

```python
from civicai.graders import grade, GradeResult

result: GradeResult = grade(state, task_id="stabilize_economy")

print(result.score)        # float ∈ [0.0, 1.0]
print(result.success)      # bool: True if score ≥ 0.75
print(result.summary)      # human-readable verdict
print(result.to_dict())    # full component breakdown (JSON-serializable)
```

Every `env.step()` call also returns the current grade, as a dict, in `info["task_grade"]`:

```python
obs, reward, done, info = env.step(action)
grade_result = info["task_grade"]   # dict: {score, success, components, ...}
```

---

## Why This Is Non-Trivial

| Challenge | Description |
|---|---|
| **Multi-objective** | 5 rubric dimensions + task-specific grader — no single scalar fully captures the objective |
| **Long-horizon** | 50-turn episodes; many actions have 5–10 turn lag before effects appear |
| **Non-linear dynamics** | Social unrest cascade, hyperinflation multiplier, epidemic out-of-control (OOC) penalty |
| **Structural vs. tactical** | Gini responds slowly to redistribution; crime responds quickly to policing |
| **Real-world data** | GDP growth, inflation, unemployment, life expectancy anchored to World Bank baseline |
| **Emergent behaviour** | Wealth inequality → unrest → protest → GDP drag (3-step causal chain) |
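
To make the long-horizon row concrete, the sketch below shows one simple way a fixed action-to-effect lag could be modelled: a delay line that releases each effect a set number of turns after it was queued. This illustrates the idea only. The `DelayedEffect` class is hypothetical, and the source does not say how CivicAI implements lag.

```python
from collections import deque


class DelayedEffect:
    """Hypothetical delay line: an effect pushed now lands `lag` turns later."""

    def __init__(self, lag):
        # Pre-fill with zeros so nothing lands during the first `lag` turns.
        self._pipeline = deque([0.0] * lag)

    def push(self, effect):
        """Queue this turn's effect and return the effect that lands this turn."""
        self._pipeline.append(effect)
        return self._pipeline.popleft()


delay = DelayedEffect(lag=5)
# A single policy change at turn 0, nothing afterwards.
landed = [delay.push(1.0 if turn == 0 else 0.0) for turn in range(7)]
print(landed.index(1.0))  # 5
```

An agent facing this kind of dynamics sees no feedback for several turns after acting, which is why credit assignment across the 50-turn episode is difficult.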