Lgr54HFi
fix: v12 GENESIS: fix 6 interaction bugs between paradigms

1. P13 MTP heads added to optimizer (were dead, never updated)
2. P18 Grokfast: skip Muon 2D params (NS normalisation cancels amplification)
   Apply only to 1D/embed params where AdamW preserves the signal
3. P16 Plateau: save/restore ALL group LRs (was destroying LLRD ratios)
4. P15 Token Triage applied to MTP loss too (was only on base loss)
5. P16 Plateau: gentler burst ×2 instead of ×3 (Grokfast already amplifies)
6. P15 Triage: per-position EMA disabled, use global excess only
cf64132 verified
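
Items 3 and 5 can be illustrated with a minimal sketch. It assumes param groups are plain dicts with an `lr` key, the same shape `torch.optim` optimizers expose as `param_groups`. The helper names `start_burst` and `end_burst`, the group names, and the LR values are hypothetical and not taken from the commit; this is not the repository's actual code.

```python
from typing import Dict, List


def start_burst(param_groups: List[Dict], factor: float = 2.0) -> List[float]:
    """Save the LR of EVERY group, then scale each one by `factor`.

    Scaling all groups by the same factor keeps the layer-wise LR decay
    (LLRD) ratios between groups unchanged during the burst.
    """
    saved = [g["lr"] for g in param_groups]
    for g in param_groups:
        g["lr"] *= factor
    return saved


def end_burst(param_groups: List[Dict], saved: List[float]) -> None:
    """Restore each group's own saved LR (not one shared value)."""
    for g, lr in zip(param_groups, saved):
        g["lr"] = lr


# Hypothetical LLRD setup: deeper groups get larger learning rates.
groups = [
    {"name": "embed", "lr": 1e-4},
    {"name": "mid", "lr": 2e-4},
    {"name": "head", "lr": 4e-4},
]

saved = start_burst(groups, factor=2.0)  # gentler x2 burst (item 5)
burst_ratio = groups[2]["lr"] / groups[0]["lr"]  # head/embed ratio mid-burst

end_burst(groups, saved)
restored = [g["lr"] for g in groups]
```

Restoring a single saved LR to all groups would collapse the per-layer ratios, which is the failure mode item 3 describes; saving one value per group avoids it.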