Add DomainTransformerConfig with presets (24M/85M/330M) 15fbfea verified rtferraz committed 10 days ago
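The preset mechanism named in this commit might look like the sketch below. All field names and sizes here are illustrative placeholders, not the repository's actual values; only the class name and the preset labels (24M/85M/330M) come from the commit message.

```python
from dataclasses import dataclass

@dataclass
class DomainTransformerConfig:
    # Hypothetical fields; the real config likely carries many more knobs.
    hidden_size: int
    num_layers: int
    num_heads: int

    @classmethod
    def from_preset(cls, name: str) -> "DomainTransformerConfig":
        # Illustrative sizes roughly matching the advertised parameter counts.
        presets = {
            "24M": cls(hidden_size=512, num_layers=8, num_heads=8),
            "85M": cls(hidden_size=768, num_layers=12, num_heads=12),
            "330M": cls(hidden_size=1024, num_layers=24, num_heads=16),
        }
        return presets[name]

cfg = DomainTransformerConfig.from_preset("85M")
print(cfg.hidden_size, cfg.num_layers)
```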
Phase 2B: Model architecture — DomainTransformerForCausalLM (NoPE, GPT-style), PLR embeddings, DCNv2 + JointFusion, 105 passing tests 2f5969e verified rtferraz committed 10 days ago
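The PLR (Periodic-Linear-ReLU) embedding mentioned here is a standard scheme for embedding scalar numeric features (sin/cos of learned frequencies, then a linear layer, then ReLU). A minimal NumPy sketch of that general formulation, with random placeholder weights rather than anything from this repository:

```python
import numpy as np

def plr_embed(x, coeffs, W, b):
    """PLR numeric embedding sketch.

    x:      (batch,) scalar feature values
    coeffs: (k,) learned periodic frequencies
    W, b:   linear projection from the 2k periodic features to d dims
    """
    v = 2 * np.pi * coeffs[None, :] * x[:, None]           # (batch, k)
    periodic = np.concatenate([np.sin(v), np.cos(v)], 1)   # (batch, 2k)
    return np.maximum(periodic @ W + b, 0.0)               # linear + ReLU

rng = np.random.default_rng(0)
k, d = 4, 8
emb = plr_embed(
    rng.normal(size=(3,)),          # 3 scalar inputs
    rng.normal(size=(k,)),          # frequencies
    rng.normal(size=(2 * k, d)),    # projection weights
    np.zeros(d),                    # bias
)
print(emb.shape)  # (3, 8)
```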