domainTokenizer / src / domain_tokenizer

Commit History

Update package init to v0.2.0 with model exports
b86b1ee
verified

rtferraz committed on

Add DCNv2 + JointFusionModel (nuFormer-style Transformer + tabular fusion)
e881ea3
verified

rtferraz committed on

Add PLR embeddings (Gorishniy et al. 2022)
d685c0e
verified

rtferraz committed on

Add DomainTransformerForCausalLM - GPT-style NoPE model with SDPA attention, weight tying, HF Trainer compatible
0dec8e4
verified

rtferraz committed on

Add DomainTransformerConfig with presets (24M/85M/330M)
15fbfea
verified

rtferraz committed on

Phase 2B: Model architecture - DomainTransformerForCausalLM (NoPE, GPT-style), PLR embeddings, DCNv2 + JointFusion, 105 passing tests
2f5969e
verified

rtferraz committed on

Add predefined schemas (FINANCE, ECOMMERCE, HEALTHCARE)
c00ac2c
verified

rtferraz committed on

Add domain_tokenizer.py - DomainTokenizerBuilder (core assembler, HF integration)
818a2e9
verified

rtferraz committed on

Add field_tokenizers.py - Sign, MagnitudeBucket, Calendar, Categorical, DiscreteNumerical tokenizers
511f3aa
verified

rtferraz committed on

Add tokenizers package init
0b06df3
verified

rtferraz committed on

Add schemas package init
04b1d24
verified

rtferraz committed on

Add schema.py - DomainSchema, FieldSpec, FieldType definitions
1a9dad0
verified

rtferraz committed on

Phase 2A: Core tokenizer library - schema, field tokenizers, composite builder, predefined schemas, 72 passing tests
0c1ca58
verified

rtferraz committed on