chimera / chimera/training/loops.py

Commit History

fix: print every step + first-step timing to diagnose slow forward
5b5a08d
verified

Lgr54HFi committed on
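
A minimal sketch of the per-step and first-step timing this fix describes. The trainer internals are not shown in the log, so the helper name and the model/batch interface (an HF-style output with a .loss field) are assumptions:

    import time
    import torch

    def timed_step(model, batch, step):
        # Synchronize so queued CUDA work is charged to the right step
        # (no-op on CPU-only runs).
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        t0 = time.perf_counter()
        loss = model(**batch).loss
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        dt_ms = (time.perf_counter() - t0) * 1000
        # The first step includes warmup (compile, allocator, caches),
        # so flag it separately when diagnosing a slow forward.
        tag = " (first step, includes warmup)" if step == 0 else ""
        print(f"step {step}: loss={loss.item():.4f} forward={dt_ms:.1f}ms{tag}")
        return loss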

fix: OOM at batch=256 — cap batch by logits memory, enable grad ckpt
5bfbb8a
verified

Lgr54HFi committed on
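
For a language model, the logits tensor (batch × seq_len × vocab_size elements) often dominates peak memory, so capping the batch by that product is a cheap OOM guard. A sketch under assumed names and an illustrative 8 GiB budget; the commit's actual numbers are not in the log:

    def cap_batch_by_logits_mem(batch_size, seq_len, vocab_size,
                                bytes_per_elem=2, budget_bytes=8 * 1024**3):
        # Largest batch whose logits tensor fits the memory budget.
        per_sample = seq_len * vocab_size * bytes_per_elem
        return min(batch_size, max(1, budget_bytes // per_sample))

    # Gradient checkpointing then trades recompute for activation memory,
    # e.g. model.gradient_checkpointing_enable() on HF-style models.

For scale: at seq_len=2048, a 50k vocab, and bf16, one sample's logits already take about 195 MiB, so batch=256 needs roughly 49 GiB for logits alone.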

perf: tune train_hyper_loop for 300-step convergence
9d8c566
verified

Lgr54HFi committed on

Fix loss rebound: lower Muon LR (0.02→0.008), clamp ternary latents, steeper cosine decay
e4d9588
verified

Lgr54HFi committed on
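
The three levers named in this message combine roughly as below. A sketch with assumed names: 0.008 is the Muon LR from the commit, while the decay floor and clamp bound are illustrative:

    import math
    import torch

    def cosine_lr(step, total_steps, base_lr=0.008, floor=0.02):
        # Steeper cosine decay: a small floor lets the LR fall near zero
        # by the end of training instead of plateauing.
        t = min(step / max(1, total_steps), 1.0)
        return base_lr * (floor + (1 - floor) * 0.5 * (1 + math.cos(math.pi * t)))

    def clamp_ternary_latents_(latents: torch.Tensor, bound: float = 1.0):
        # Bound the continuous latents behind the {-1, 0, +1} weights so
        # straight-through updates cannot drift and rebound the loss.
        latents.data.clamp_(-bound, bound)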

Upload chimera/training/loops.py
6d5c935
verified

Lgr54HFi committed on

Fix NaN loss reporting: show nan instead of 0.0 when all steps in window are NaN
8e41f12
verified

Lgr54HFi committed on
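
The fix is easiest to see in a small window-averaging helper: skip NaN steps when averaging, and only report nan when nothing in the window was finite. A sketch; the function name is an assumption:

    import math

    def window_avg(losses):
        finite = [x for x in losses if not math.isnan(x)]
        # All-NaN window: report nan instead of a misleading 0.0.
        return sum(finite) / len(finite) if finite else float("nan")

So window_avg([float("nan")] * 4) now reports nan rather than 0.0.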

Upload chimera/training/loops.py
edcdcb3
verified

Lgr54HFi committed on

feat: loops.py v11 — aligned with GENESIS engine, no distiller overhead
3859a82
verified

Lgr54HFi committed on

feat: loops.py — integrate Muon + MTP + EMA distillation in training loop
9897d01
verified

Lgr54HFi committed on
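
Of the three pieces, EMA distillation has the most standard shape: keep a slow exponential-moving-average copy of the student as the teacher. A sketch with an illustrative 0.999 decay; the Muon and MTP wiring cannot be reconstructed from the log:

    import torch

    @torch.no_grad()
    def ema_update(teacher, student, decay=0.999):
        # teacher <- decay * teacher + (1 - decay) * student
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)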

feat: train_hyper_loop with progressive looping, evolution loss feedback, no progressive_unfreeze default

Activates dormant chimera paradigms:
1. Progressive looping: 1→2→3 Parcae loops during training
2. Evolution receives prev_loss for surprise-based memory writes
3. progressive_unfreeze disabled by default (all layers train from start)
4. Logs loop count and NaN-safe averaging
b6bcd75
verified

Lgr54HFi committed on
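
One plausible reading of "progressive looping" is a step-indexed schedule for the number of Parcae loops. The thresholds below are assumptions; the commit only states the 1→2→3 progression:

    def parcae_loops_for_step(step, total_steps, max_loops=3):
        # 1 loop in the first third of training, 2 in the second, 3 after.
        phase = step / max(1, total_steps)
        if phase < 1 / 3:
            return 1
        if phase < 2 / 3:
            return 2
        return max_loops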

fix: loops.py — use chimera_turbo v8 defaults (wd=0.01, warmup=750, β2=0.98) instead of hardcoded values
e2f5e25
verified

Lgr54HFi committed on
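
The stated defaults, collected in one place instead of hardcoded. The values are from the commit message; the class and field names are assumptions:

    from dataclasses import dataclass

    @dataclass
    class ChimeraTurboV8Defaults:
        weight_decay: float = 0.01   # wd=0.01
        warmup_steps: int = 750      # warmup=750
        beta2: float = 0.98          # β2=0.98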

fix: re-enable torch.compile in train_hyper_loop (STE graph breaks fixed)
f6670ea
verified

Lgr54HFi committed on
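
A straight-through estimator (STE) can break torch.compile graphs when written with custom autograd functions or data-dependent Python control flow; the detach trick below is the usual graph-break-free form, where the forward pass uses quantized values and the backward pass sees the identity. A sketch, not the repo's code:

    import torch

    def ste_ternary(latents: torch.Tensor) -> torch.Tensor:
        q = torch.round(latents.clamp(-1, 1))      # values in {-1, 0, +1}
        return latents + (q - latents).detach()    # gradient flows to latents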

fix: train_hyper_loop grad_accum=1 (DataLoader already batches), better tok/s logging
31d69ba
verified

Lgr54HFi committed on
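
With grad_accum=1, each optimizer step consumes exactly one DataLoader batch, so tokens/sec reduces to tokens seen over wall-clock time. A minimal counter with assumed names:

    import time

    class TokPerSec:
        def __init__(self):
            self.t0 = time.perf_counter()
            self.tokens = 0

        def update(self, batch_tokens: int) -> float:
            # Rolling tokens/sec since construction.
            self.tokens += batch_tokens
            return self.tokens / (time.perf_counter() - self.t0)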

Upload folder using huggingface_hub
11c11f8
verified

Lgr54HFi committed on