{
"task_id": "hard_05",
"version": "1.0.0",
"created_at": "2026-03-11",
"metadata": {
"domain": "credit_card_optimization",
"difficulty": "hard",
"task_number": 5,
"complexity_hint": {
"max_tokens": 8000,
"expected_output": "complex strategy with contingency plans, stochastic elements, detailed EV"
},
"requires_human_review": true
},
"prompt": {
"system": "",
"user": "You are a financial optimization agent tasked with designing the optimal 36-month credit card strategy for a household.\n\nYour goal is to maximize the household\u2019s net expected financial value over the next 3 years.\n\nNet value includes:\n\n* signup bonuses\n* ongoing rewards\n* travel redemption value\n* statement credits that will realistically be used\n\nNet value must subtract:\n\n* annual fees\n* opportunity costs of locking out future bonuses\n* interest if spend requirements exceed realistic spending capacity\n\n**Household Profile:**\n\nPerson A\n\n* Age 38\n* Credit score: 780\n* Income: $450k\n* Currently holds 7 cards\n\nPerson B\n\n* Age 36\n* Credit score: 720\n* Income: $120k\n* Currently holds 4 cards\n\n**Combined monthly spend:**\n\n| category | monthly |\n| ----- | ----- |\n| airfare / hotels | $4,200 |\n| dining | $1,600 |\n| groceries | $1,200 |\n| rideshare | $600 |\n| business ads / SaaS | $2,500 |\n| utilities | $900 |\n| other | $2,500 |\n\nTotal monthly spend: $13,500\n\n**Current Cards:**\n\nPerson A\n\n* Amex Platinum (3 years old)\n* Amex Gold (2 years old)\n* Chase Sapphire Preferred (45 months old)\n* Capital One Venture X (1 year old)\n* Chase Ink Preferred (7 months old)\n* Citi Double Cash (4 months old)\n* Marriott Bonvoy Boundless (18 months old)\n\nPerson B\n\n* Chase Freedom Unlimited (10 years old)\n* Amex Blue Business Plus (8 years old)\n* Capital One Venture (6 years old)\n* Citi Premier (2.5 years old)\n\n**Constraints**\n\n* Household may have no more than 10 cards total at any time.\n* Household annual fee budget must remain below $1,200 per year.\n* Household wants to minimize complexity: no more than 3 active reward ecosystems.\n* Approval probability must remain above 55% for every new card application.\n* No more than 3 card applications within any rolling 90-day window.\n* All spend used to meet signup bonuses must remain feasible given actual household spending.\n* Household prefers transferable point ecosystems, but cashback is allowed if EV is at least 10% higher.\n\n**Travel Goals**\n\nEvery year for the next 3 years:\n\n* 2 business class flights NYC \u2192 Tokyo (ANA flight)\n* 2 economy flights NYC \u2192 Paris (KLM flight)\n* several domestic flights (agnostic toward American/Delta/United)\n\nFlights should be booked using points whenever EV is higher than paying cash.\n\nKeep in mind the dataset contains:\n\n* airline alliances\n* transfer partners\n* redemption price distributions\n* transfer ratios\n\n**Offer Uncertainty**\n\nHistorical data shows some cards periodically have elevated signup bonuses.\n\nExample distributions:\n\n| Card | Elevated bonus probability |\n| ----- | ----- |\n| Amex Platinum | 35% chance of 150k annually |\n| Chase Sapphire Reserve | 20% chance of 80k |\n| Venture X | 15% chance of 100k |\n\nYou may choose to delay card applications if expected value improves.\n\n**Credit Score Dynamics**\n\nOpening a card:\n\n* reduces the credit score by 5 points immediately; the score recovers within 3 months, and at 6 months it increases by 5 points because the added available credit lowers utilization\n\nClosing a card may:\n\n* reduce available credit\n* affect approval probability\n\n**Manufactured Spend Opportunity**\n\nPerson A has access to up to $8k/mo of manufactured spend; however, some issuers flag high MS (over $3k/mo). Surpassing that risks issuer shutdown.\n\n**Task**\n\nProduce the optimal 36-month strategy including:\n\n* which cards each person should open and when\n* which cards should be downgraded or cancelled\n* how spending should be allocated across cards\n* which reward ecosystems should be prioritized\n* which travel redemptions should be used\n* whether any card applications should be delayed to wait for elevated bonuses\n\nYour strategy must include:\n\n* a month-by-month timeline and spending plan by card\n* expected reward value for each decision\n* final expected net value after 36 months\n* a backup suggestion for each application in case it gets rejected",
"knowledge_base_ref": "knowledge_base.md",
"kb_filter": [
"Chase Ink Business Unlimited",
"Chase Ink Business Preferred",
"Chase Ink Business Premier",
"Capital One Venture X",
"Capital One Venture X Business",
"Amex Business Gold",
"Amex Business Platinum",
"Capital One Business Spark 2X Cash",
"Citi Strata",
"Citi Strata Premier",
"BofA Premium Rewards",
"BofA Premium Rewards Elite",
"Chase World of Hyatt",
"Chase United Explorer",
"Chase United Gateway",
"Atmos Rewards Summit",
"Amex Delta SkyMiles Gold",
"Amex Delta SkyMiles Platinum",
"Amex Delta SkyMiles Reserve",
"Amex Green",
"American Express Platinum",
"American Express Gold",
"Chase Sapphire Preferred",
"Chase Freedom Unlimited",
"Amex Blue Business Plus",
"Capital One Venture",
"Citi Double Cash",
"Chase Marriott Bonvoy Boundless",
"Chase Sapphire Reserve",
"Chase Business Sapphire Reserve",
"Capital One Business Spark 2X Miles",
"Amex Blue Business Cash"
],
"system_prompt_ref": "system_prompt_template.md"
},
"scoring": {
"dimensions": {
"constraint_compliance": {
"weight": 0.3,
"type": "automated",
"description": "Hard rule checks: velocity limits, eligibility, user constraints",
"checks": {
"velocity_rules": null,
"eligibility_rules": null,
"user_constraints": null,
"expected_cards": [
"Chase Ink Business Unlimited",
"Chase Ink Business Preferred",
"Chase Ink Business Premier",
"Capital One Venture X",
"Capital One Venture X Business",
"Amex Business Gold",
"Amex Business Platinum",
"Capital One Business Spark 2X Cash",
"Citi Strata",
"Citi Strata Premier",
"BofA Premium Rewards",
"BofA Premium Rewards Elite",
"Chase World of Hyatt",
"Chase United Explorer",
"Chase United Gateway",
"Atmos Rewards Summit",
"Amex Delta SkyMiles Gold",
"Amex Delta SkyMiles Platinum",
"Amex Delta SkyMiles Reserve",
"Amex Green",
"American Express Platinum",
"American Express Gold"
],
"expected_housing_option": null,
"key_constraints_flags": [
"dual_person_household",
"10_card_limit",
"1200_annual_fee_budget",
"3_ecosystems_max",
"90_day_velocity",
"manufactured_spend",
"elevated_offer_probability",
"travel_redemption_goals",
"36_month_horizon"
]
},
"hard_constraint": false
},
"ev_accuracy": {
"weight": 0.4,
"type": "automated",
"description": "EV calculation accuracy vs. reference solution",
"reference": {
"reference_ev_usd": 40693.0,
"ev_tolerance_pct": 0.05
}
},
"reasoning_quality": {
"weight": 0.2,
"type": "human",
"description": "Quality of tradeoff articulation and strategic reasoning (0-3 scale)",
"rubric": {
"0": "No reasoning or incorrect reasoning",
"1": "Surface-level reasoning, misses key tradeoffs",
"2": "Correct tradeoffs identified with clear justification",
"3": "Expert-level nuance including edge cases and constraint interactions"
},
"score": null
},
"constraint_prioritization": {
"weight": 0.1,
"type": "human",
"description": "Correct handling of ambiguity and conflicting constraints",
"score": null
}
},
"passing_threshold": 0.6,
"hard_constraint_failure_zeroes_dimension": true
},
"reference_solution": {
"_status": "EXPERT_REVIEWED",
"recommended_cards": [
"Chase Ink Business Unlimited",
"Chase Ink Business Preferred",
"Chase Ink Business Premier",
"Capital One Venture X",
"Capital One Venture X Business",
"Amex Business Gold",
"Amex Business Platinum",
"Capital One Business Spark 2X Cash",
"Citi Strata",
"Citi Strata Premier",
"BofA Premium Rewards",
"BofA Premium Rewards Elite",
"Chase World of Hyatt",
"Chase United Explorer",
"Chase United Gateway",
"Atmos Rewards Summit",
"Amex Delta SkyMiles Gold",
"Amex Delta SkyMiles Platinum",
"Amex Delta SkyMiles Reserve",
"Amex Green",
"American Express Platinum",
"American Express Gold"
],
"total_ev_usd": 40693.0,
"ev_breakdown": {
"signup_bonuses_usd": 25003.0,
"ongoing_rewards_usd": 20794.0,
"credits_usd": 0.0,
"annual_fees_usd": -5104.0,
"other_usd": 0.0
},
"housing_option": null,
"key_constraints_flags": [
"dual_person_household",
"10_card_limit",
"1200_annual_fee_budget",
"3_ecosystems_max",
"90_day_velocity",
"manufactured_spend",
"elevated_offer_probability",
"travel_redemption_goals",
"36_month_horizon"
],
"expert_notes": "36-month dual-person strategy. Person A (business-eligible, MS access): $17,331 EV. Person B: $23,362 EV. Combined: $40,693. Each person opens 1 card per 90 days (12 cards each over 36 months). Person A: Chase business cards first, then Capital One and Amex business cards, then Citi and BofA personal cards. Person B: Chase personal cards first (Hyatt, United), then airline and bank cards, then Amex personal cards. Cancel every card at 1 year to avoid the second annual fee. Put non-bonus spend on Venture X. Person A's MS of $8k/mo = $288k extra spend over 36 months; keep MS below $3k/mo per issuer to avoid shutdown flags. Elevated bonus probabilities are factored into the EVs for Person B's Amex Platinum (104.5k expected) and Venture X (78.75k expected)."
}
}