SFT/Alignment - Phase 007-06-MLP8: ethicalabs/Kurtis-EON1-SFT Mix (1 epoch)
#9 by mrs83 - opened
Correction: that's v0.7.6, not v0.7.5. v0.7.5 was still on FineTome.
mrs83 changed discussion title from SFT/Alignment - Phase 007-05-MLP8: ethicalabs/Kurtis-EON1-SFT Mix (1 epoch) to SFT/Alignment - Phase 007-06-MLP8: ethicalabs/Kurtis-EON1-SFT Mix (1 epoch)
Look, linear GPU MEM! Just running lm_eval with batch_size=16 on the new architecture.
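(For reference, a minimal sketch of launching that kind of run through the harness's Python API; the checkpoint ID and task list below are placeholders, not the actual eval config for this phase.)

```python
# Minimal sketch of an lm-evaluation-harness run via its Python API.
# The pretrained checkpoint and tasks are placeholders, not the actual
# config used for this phase.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                           # HuggingFace backend
    model_args="pretrained=ethicalabs/YOUR-CHECKPOINT,dtype=bfloat16",
    tasks=["hellaswag", "arc_easy"],                      # placeholder tasks
    batch_size=16,                                        # matches the run above
    device="cuda:0",
)
print(results["results"])
```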
Notice the orange line on the left: a perfectly flat, constant allocation of ~15 GB out of 96 GB, while compute (the blue line) is pinned at 100%.
By shifting to a custom 3-pass Triton kernel, we got a memory footprint that scales linearly, which lets us hold massive batch sizes and long contexts entirely in memory without triggering PyTorch's garbage collection or OOMs.
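If anyone wants to sanity-check the flat-allocation behaviour on their own hardware, here's a rough sketch using PyTorch's allocator counters; `run_forward` is a stand-in for whatever workload you're profiling, not part of this repo:

```python
# Rough sketch: check that peak GPU memory grows ~linearly with batch size
# by reading PyTorch's allocator counters. `run_forward` is a hypothetical
# callable standing in for the workload being profiled.
import torch

def peak_mem_gb(run_forward, batch_size: int) -> float:
    torch.cuda.empty_cache()                  # drop cached blocks from prior runs
    torch.cuda.reset_peak_memory_stats()      # start peak tracking from zero
    run_forward(batch_size)                   # the workload under test
    torch.cuda.synchronize()                  # make sure all kernels finished
    return torch.cuda.max_memory_allocated() / 1024**3

# Example usage: if memory is linear, each doubling of batch size should
# roughly double the reported peak.
# for bs in (1, 2, 4, 8, 16):
#     print(bs, peak_mem_gb(my_forward, bs))
```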

