"""
Model components for domainTokenizer.

- DomainTransformerConfig: HF-compatible configuration
- DomainTransformerForCausalLM: GPT-style causal decoder using NoPE
  (no positional encoding)
- PeriodicLinearReLU: PLR numerical embeddings (Gorishniy et al., 2022)
- JointFusionModel: nuFormer-style Transformer + DCNv2 fusion, built from
  the DCNv2CrossLayer / DCNv2 modules also exported here
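
Example (a minimal usage sketch; it assumes the config constructor exposes
HF-style defaults, which are not shown in this file):

    >>> config = DomainTransformerConfig()
    >>> model = DomainTransformerForCausalLM(config)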
"""
from .configuration import DomainTransformerConfig
from .modeling import (
DomainTransformerPreTrainedModel,
DomainTransformerModel,
DomainTransformerForCausalLM,
DomainTransformerAttention,
DomainTransformerMLP,
DomainTransformerBlock,
)
from .plr_embeddings import PeriodicLinearReLU
from .joint_fusion import DCNv2CrossLayer, DCNv2, JointFusionModel
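
# Public names re-exported by this package; mirrors the imports above.
__all__ = [
    "DomainTransformerConfig",
    "DomainTransformerPreTrainedModel",
    "DomainTransformerModel",
    "DomainTransformerForCausalLM",
    "DomainTransformerAttention",
    "DomainTransformerMLP",
    "DomainTransformerBlock",
    "PeriodicLinearReLU",
    "DCNv2CrossLayer",
    "DCNv2",
    "JointFusionModel",
]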