Commit History

Add DomainTransformerForCausalLM — GPT-style NoPE model with SDPA attention, weight tying, HF Trainer compatible
0dec8e4
verified

rtferraz committed on