CivicAI — Real-World Problem Statement

Problem Definition

AI-driven societal policy optimization under uncertainty

Modern governments face a combinatorial decision-making problem: thousands of interdependent policy levers (taxes, healthcare spending, education, policing, subsidies, emergency responses) interact through complex causal chains to produce emergent societal outcomes across economic, public-health, and social-cohesion dimensions, often with lags of weeks to years and under high uncertainty.

No human decision-maker can simultaneously optimise all dimensions. AI agents trained in CivicAI learn to:

- Observe rich societal state (12+ indicators)
- Act across a continuous multi-dimensional policy space
- Receive delayed, multi-objective feedback
- Adapt to unexpected shocks (pandemics, market crashes, social unrest)

Real-World Domain Mapping

| CivicAI dimension | Real-world counterpart | Real data anchor |
| --- | --- | --- |
| `gdp`, `gdp_growth`, `inflation` | Macroeconomic fiscal policy | World Bank GDP / IMF inflation data |
| `employment_rate` | Labour market policy | ILO unemployment statistics |
| `tax_rate`, `budget_balance` | Government revenue & deficit | OECD fiscal balance data |
| `health_index`, `infection_rate` | Public-health capacity & epidemics | WHO health expenditure / GHI |
| `crime_rate` | Rule-of-law & public safety | UNODC crime indices |
| `public_satisfaction` | Democratic legitimacy / approval | Edelman Trust Barometer |
| `emergent.wealth_inequality` | Distributional equity | Gini coefficient (World Bank) |
| `emergent.social_unrest` | Political stability | World Governance Indicators |
| `food_reserves`, `energy_reserves` | Strategic resource security | FAO / IEA stockpile data |
| `education_quality` | Human capital investment | UNESCO / PISA |

Domain 1 — Governance (Fiscal Policy)

Real-world problem: Governments must set tax rates that raise revenue without suppressing growth, and allocate budgets across competing public goods (healthcare vs. education vs. security) while maintaining fiscal sustainability.

CivicAI mapping:

- Action: tax_rate ∈ [0, 1], healthcare_budget, education_budget, police_budget
- State: gdp, inflation, employment_rate, budget_balance
- Challenge: High taxes → GDP drag; low taxes → deficit spiral

Domain 2 — Economy (Macroeconomic Stabilisation)

Real-world problem: Recessions require countercyclical stimulus, but overspending triggers inflation. Optimal fiscal multipliers depend on the current economic regime.

CivicAI mapping:

- Action: subsidy_policy ∈ {none, agriculture, industry, technology}
- State: gdp_growth, inflation, employment_rate
- Challenge: Technology subsidies boost long-run growth but worsen near-term inequality; agriculture subsidies improve food security but reduce GDP growth

Domain 3 — Public Health (Epidemic Management)

Real-world problem: Pandemics create tradeoffs between infection suppression (via lockdowns) and economic activity. Optimal policies depend on medical supply capacity, infection dynamics, and public compliance.

CivicAI mapping:

- Action: healthcare_budget, emergency_response (lockdown / stimulus / open)
- State: infection_rate, health_index, medical_supplies, gdp
- Challenge: Lockdown reduces infection but crushes GDP; premature opening causes epidemic rebound

Domain 4 — Social Cohesion (Crisis Management)

Real-world problem: Compound crises (unemployment + crime + inequality + unrest) exhibit non-linear cascade dynamics: once social unrest exceeds a threshold, even good economic data fails to restore stability.

CivicAI mapping:

- Action: All levers simultaneously; no single dominant strategy
- State: public_satisfaction, crime_rate, emergent.wealth_inequality, emergent.social_unrest
- Challenge: Inequality is a slow-moving structural variable; quick fixes (police budget) address symptoms, not causes
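The levers named across the four domain mappings can be gathered into one typed structure. The sketch below is a hypothetical container, not CivicAI's actual action class: the `PolicyAction`, `SubsidyPolicy`, and `EmergencyResponse` names, the default values, and the assumption that every budget is a share in [0, 1] are illustrative. Only the field names and the listed option values come from the mappings above.

```python
from dataclasses import dataclass
from enum import Enum


class SubsidyPolicy(Enum):
    # Option values as listed in the Domain 2 mapping.
    NONE = "none"
    AGRICULTURE = "agriculture"
    INDUSTRY = "industry"
    TECHNOLOGY = "technology"


class EmergencyResponse(Enum):
    # Option values as listed in the Domain 3 mapping.
    LOCKDOWN = "lockdown"
    STIMULUS = "stimulus"
    OPEN = "open"


@dataclass(frozen=True)
class PolicyAction:
    """Hypothetical bundle of all policy levers for one turn."""
    tax_rate: float
    healthcare_budget: float
    education_budget: float
    police_budget: float
    subsidy_policy: SubsidyPolicy = SubsidyPolicy.NONE
    emergency_response: EmergencyResponse = EmergencyResponse.OPEN

    def __post_init__(self):
        # ASSUMPTION: budgets are shares in [0, 1]; the source only
        # states this range for tax_rate.
        for name in ("tax_rate", "healthcare_budget",
                     "education_budget", "police_budget"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")


action = PolicyAction(
    tax_rate=0.30,
    healthcare_budget=0.25,
    education_budget=0.20,
    police_budget=0.10,
    subsidy_policy=SubsidyPolicy.TECHNOLOGY,
)
```

Validating ranges at construction time keeps malformed actions from reaching the simulator, which matters when an agent emits continuous values that may drift out of bounds.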

Tasks

Task 1 — Economic Stability `[EASY]`

Objective: Restore an economy in mild recession to fiscal stability.

| Criterion | Target | Failure |
| --- | --- | --- |
| Inflation | < 6% | ≥ 15% |
| Employment | > 85% | ≤ 65% |
| GDP | > $400B | ≤ $250B |
| Budget Balance | Surplus preferred | ≤ −30% deficit |

Initial conditions: GDP $450B, inflation 7%, employment 82%, satisfaction 55%

Deterministic grader (EconomicStabilityGrader):

score = 0.40 × inflation_score
      + 0.40 × employment_score
      + 0.10 × gdp_score
      + 0.10 × budget_score

inflation_score  = linear_inv(inflation, ideal=3%, fail=15%)
                   × 0.40 if inflation > 20% (hyperinflation)
employment_score = linear(employment_rate, fail=65%, ideal=90%)
gdp_score        = linear(gdp, fail=$250B, ideal=$500B)
budget_score     = linear(budget_balance, fail=−30%, ideal=0%)

All linear() / linear_inv() produce values in [0.0, 1.0].
No random calls. Always deterministic.
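The grader formulas rely on `linear()` and `linear_inv()` without defining them. A minimal sketch of one plausible reading follows, assuming both are clamped linear interpolations between the `fail` and `ideal` anchors. The function bodies, the units (rates as fractions, GDP in billions of dollars), and the `budget_balance=0.0` starting value are assumptions, since the initial conditions do not state a budget balance.

```python
def linear(value, fail, ideal):
    """Higher is better: 0.0 at or below `fail`, 1.0 at or above `ideal`."""
    frac = (value - fail) / (ideal - fail)
    return max(0.0, min(1.0, frac))


def linear_inv(value, ideal, fail):
    """Lower is better: 1.0 at or below `ideal`, 0.0 at or above `fail`."""
    frac = (fail - value) / (fail - ideal)
    return max(0.0, min(1.0, frac))


def economic_stability_score(inflation, employment_rate, gdp, budget_balance):
    """Weighted Task 1 score using the weights and anchors stated above."""
    inflation_score = linear_inv(inflation, ideal=0.03, fail=0.15)
    if inflation > 0.20:
        # Hyperinflation multiplier from the rubric. Under this clamped
        # reading the score is already 0.0 here, so it has no effect.
        inflation_score *= 0.40
    employment_score = linear(employment_rate, fail=0.65, ideal=0.90)
    gdp_score = linear(gdp, fail=250.0, ideal=500.0)
    budget_score = linear(budget_balance, fail=-0.30, ideal=0.0)
    return (0.40 * inflation_score
            + 0.40 * employment_score
            + 0.10 * gdp_score
            + 0.10 * budget_score)


# Task 1 initial conditions; budget_balance=0.0 is an ASSUMPTION.
initial_score = economic_stability_score(
    inflation=0.07, employment_rate=0.82, gdp=450.0, budget_balance=0.0
)
print(round(initial_score, 3))  # about 0.719
```

Under these assumptions the starting state scores roughly 0.72, so the agent has to improve inflation or employment, the two heavily weighted terms, to reach the success threshold.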

Success threshold: score ≥ 0.75

Task 2 — Pandemic Management `[MEDIUM]`

Objective: Suppress an epidemic that starts at a 20% infection rate without destroying the economy.

| Criterion | Target | Failure |
| --- | --- | --- |
| Infection rate | < 10% | ≥ 30% |
| Health index | > 0.60 | ≤ 0.30 |
| GDP | > $300B | ≤ $200B |
| Medical supplies | > 0.60 | ≤ 0.20 |

Initial conditions: Infection 20%, health index 0.55, GDP $480B, medical supplies 0.50

Deterministic grader (PandemicManagementGrader):

score = 0.40 × infection_score
      + 0.30 × health_score
      + 0.20 × gdp_score
      + 0.10 × supplies_score

infection_score = linear_inv(infection_rate, ideal=2%, fail=30%)
                  × 0.50 if infection_rate ≥ 40% (epidemic out of control, OOC)
health_score    = linear(health_index, fail=0.30, ideal=0.80)
gdp_score       = linear(gdp, fail=$200B, ideal=$480B)
supplies_score  = linear(medical_supplies, fail=0.20, ideal=0.80)

No random calls. Always deterministic.

Core tension: lockdown raises infection_score but lowers gdp_score, so the agent must find a trajectory that balances the two.
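To make the lockdown-versus-GDP tension concrete, the sketch below scores three end states with the Task 2 weights and anchors. The three scenarios are invented for illustration and are not simulator output. The clamped-linear reading of `linear()` and `linear_inv()` is also an assumption.

```python
def clamp01(x):
    """Clamp a value to the [0.0, 1.0] range."""
    return max(0.0, min(1.0, x))


def pandemic_score(infection_rate, health_index, gdp, medical_supplies):
    """Weighted Task 2 score; rates are fractions, GDP is in $B."""
    infection_score = clamp01((0.30 - infection_rate) / (0.30 - 0.02))
    if infection_rate >= 0.40:
        # Out-of-control multiplier from the rubric. Under this clamped
        # reading the score is already 0.0 here, so it has no effect.
        infection_score *= 0.50
    health_score = clamp01((health_index - 0.30) / (0.80 - 0.30))
    gdp_score = clamp01((gdp - 200.0) / (480.0 - 200.0))
    supplies_score = clamp01((medical_supplies - 0.20) / (0.80 - 0.20))
    return (0.40 * infection_score
            + 0.30 * health_score
            + 0.20 * gdp_score
            + 0.10 * supplies_score)


# HYPOTHETICAL end states, chosen only to illustrate the tradeoff.
scenarios = {
    "hard lockdown": dict(infection_rate=0.04, health_index=0.65,
                          gdp=300.0, medical_supplies=0.60),
    "early reopening": dict(infection_rate=0.25, health_index=0.50,
                            gdp=470.0, medical_supplies=0.40),
    "balanced": dict(infection_rate=0.06, health_index=0.70,
                     gdp=420.0, medical_supplies=0.70),
}
scores = {name: pandemic_score(**state) for name, state in scenarios.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

With these invented numbers, only the balanced end state clears the 0.75 threshold: the hard lockdown loses too much on the GDP term, and early reopening loses too much on the infection and health terms.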

Success threshold: score ≥ 0.75

Task 3 — Social Stability Crisis `[HARD]`

Objective: Restore social order from a compound multi-domain crisis with cascading failure risk.

| Criterion | Target | Failure |
| --- | --- | --- |
| Public satisfaction | > 50% | ≤ 15% |
| Crime rate | < 12% | ≥ 35% |
| Employment rate | > 80% | ≤ 55% |
| Wealth inequality (Gini) | < 0.40 | ≥ 0.70 |

Initial conditions: Employment 68%, crime 25%, satisfaction 30%, Gini 0.55, social unrest 0.45

Deterministic grader (SocialCrisisGrader):

score = (0.30 × satisfaction_score
       + 0.25 × crime_score
       + 0.25 × employment_score
       + 0.20 × inequality_score)
      × 0.60 if social_unrest > 0.65 (cascade penalty)

satisfaction_score  = linear(public_satisfaction, fail=0.15, ideal=0.70)
crime_score         = linear_inv(crime_rate, ideal=5%, fail=35%)
                      × 0.50 if crime_rate ≥ 40%
employment_score    = linear(employment_rate, fail=55%, ideal=88%)
inequality_score    = linear_inv(gini, ideal=0.20, fail=0.70)

No random calls. Always deterministic.
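The effect of the cascade penalty can be checked on the Task 3 initial conditions. The sketch below assumes the 0.60 multiplier applies to the whole weighted sum rather than to the last term only, and again treats `linear()` and `linear_inv()` as clamped linear interpolations. Both are readings of the rubric, not confirmed behaviour of `SocialCrisisGrader`.

```python
def clamp01(x):
    """Clamp a value to the [0.0, 1.0] range."""
    return max(0.0, min(1.0, x))


def social_crisis_score(public_satisfaction, crime_rate, employment_rate,
                        gini, social_unrest):
    """Weighted Task 3 score with the social-unrest cascade multiplier."""
    satisfaction_score = clamp01((public_satisfaction - 0.15) / (0.70 - 0.15))
    crime_score = clamp01((0.35 - crime_rate) / (0.35 - 0.05))
    if crime_rate >= 0.40:
        # Rubric multiplier; the clamped score is already 0.0 at this level.
        crime_score *= 0.50
    employment_score = clamp01((employment_rate - 0.55) / (0.88 - 0.55))
    inequality_score = clamp01((0.70 - gini) / (0.70 - 0.20))
    base = (0.30 * satisfaction_score
            + 0.25 * crime_score
            + 0.25 * employment_score
            + 0.20 * inequality_score)
    # ASSUMPTION: the cascade penalty scales the entire weighted sum.
    multiplier = 0.60 if social_unrest > 0.65 else 1.0
    return base * multiplier


# Initial conditions as stated for Task 3.
initial = dict(public_satisfaction=0.30, crime_rate=0.25,
               employment_rate=0.68, gini=0.55, social_unrest=0.45)
start_score = social_crisis_score(**initial)

# Same metrics, but with unrest pushed past the 0.65 threshold.
cascaded_score = social_crisis_score(**{**initial, "social_unrest": 0.70})
print(round(start_score, 3), round(cascaded_score, 3))
```

Under these assumptions the starting state scores about 0.32, and crossing the unrest threshold alone cuts that to about 0.19 with every other metric unchanged.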

Why it's hard:

- Gini is structural: it requires sustained tax redistribution over many turns
- The social unrest cascade multiplier punishes instability even when individual metrics improve
- No single dominant strategy; agents must balance all four dimensions simultaneously

Success threshold: score ≥ 0.75

Grader API

```python
from civicai.graders import grade, GradeResult

result: GradeResult = grade(state, task_id="stabilize_economy")

print(result.score)        # float ∈ [0.0, 1.0]
print(result.success)      # bool: True if score ≥ 0.75
print(result.summary)      # human-readable verdict
print(result.to_dict())    # full component breakdown (JSON-serializable)
```

Every env.step() call returns this grade in info["task_grade"]:

```python
obs, reward, done, info = env.step(action)
grade_result = info["task_grade"]   # dict: {score, success, components, ...}
```
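A full episode loop that tracks the per-step grade might look like the sketch below. Because the environment's constructor and `reset()` signature are not shown here, the example runs against `StubEnv`, a hypothetical stand-in whose score simply rises each step. Only the four-tuple returned by `step()` and the `info["task_grade"]` keys `score` and `success` follow the description above.

```python
class StubEnv:
    """HYPOTHETICAL stand-in for the real environment, for illustration only."""

    def __init__(self, max_turns=50):
        self.max_turns = max_turns
        self.turn = 0

    def reset(self):
        self.turn = 0
        return {"turn": self.turn}

    def step(self, action):
        self.turn += 1
        # Fake dynamics: the score climbs by 0.05 per turn from 0.30.
        score = min(1.0, (30 + 5 * self.turn) / 100)
        grade = {"score": score, "success": score >= 0.75}
        done = grade["success"] or self.turn >= self.max_turns
        reward = score
        return {"turn": self.turn}, reward, done, {"task_grade": grade}


env = StubEnv()
obs = env.reset()
history = []
done = False
while not done:
    action = {"tax_rate": 0.30}  # placeholder; a real agent would choose this
    obs, reward, done, info = env.step(action)
    history.append(info["task_grade"]["score"])

final_grade = info["task_grade"]
print(len(history), final_grade)
```

Recording the grade at every step, rather than only at the end, makes it possible to see whether the agent reached the threshold through steady improvement or a late recovery.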

Why This Is Non-Trivial

| Challenge | Description |
| --- | --- |
| Multi-objective | 5 rubric dimensions + task-specific grader; no single scalar fully captures the objective |
| Long-horizon | 50-turn episodes; many actions have 5–10 turn lag before effects appear |
| Non-linear dynamics | Social unrest cascade, hyperinflation multiplier, epidemic out-of-control (OOC) penalty |
| Structural vs. tactical | Gini responds slowly to redistribution; crime responds quickly to policing |
| Real-world data | GDP growth, inflation, unemployment, life expectancy anchored to World Bank baseline |
| Emergent behaviour | Wealth inequality → unrest → protest → GDP drag (3-step causal chain) |
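The long-horizon row is easier to appreciate with a toy model of delayed effects. The sketch below is not CivicAI's simulator: `DelayedEffect` is a hypothetical helper showing how an effect applied at turn t could surface only at turn t + lag, which is why credit assignment across a 50-turn episode is difficult.

```python
from collections import deque


class DelayedEffect:
    """Toy fixed-lag pipeline: an effect pushed now matures `lag` turns later."""

    def __init__(self, lag):
        # Pre-fill with zeros so nothing matures during the first `lag` turns.
        self._pipeline = deque([0.0] * lag, maxlen=lag)

    def push(self, effect):
        """Queue this turn's effect and return the one that matures now."""
        matured = self._pipeline[0]
        # With maxlen set, appending drops the leftmost (matured) entry.
        self._pipeline.append(effect)
        return matured


# One policy impulse at turn 0, then no further action.
lag = 5
channel = DelayedEffect(lag)
impulses = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
observed = [channel.push(x) for x in impulses]
print(observed)  # the impulse surfaces at turn 5
```

An agent that evaluates an action only by the next few observations would see no response from this channel and could wrongly conclude the action had no effect.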
