feat: train_hyper.py v3 - full architecture, optimized forward + MeZO, no features cut 8e88097 verified Lgr54HFi committed 12 days ago
fix: train_hyper.py v2 - lean mode, reduced layers, no overhead, 10k+ tok/s target dc90255 verified Lgr54HFi committed 12 days ago
feat: add train_hyper.py - 7-paradigm stacked training for 10k+ tok/s on CPU f9d5ad9 verified Lgr54HFi committed 12 days ago