Commit History

Add DCNv2 + JointFusionModel (nuFormer-style Transformer + tabular fusion)
e881ea3
verified

rtferraz committed on

Add PLR embeddings (Gorishniy et al. 2022)
d685c0e
verified

rtferraz committed on

Add DomainTransformerForCausalLM - GPT-style NoPE model with SDPA attention, weight tying, HF Trainer compatible
0dec8e4
verified

rtferraz committed on

Add DomainTransformerConfig with presets (24M/85M/330M)
15fbfea
verified

rtferraz committed on

Phase 2B: Model architecture - DomainTransformerForCausalLM (NoPE, GPT-style), PLR embeddings, DCNv2 + JointFusion, 105 passing tests
2f5969e
verified

rtferraz committed on

Add comprehensive test suite - 72 passing tests covering all components
8efa945
verified

rtferraz committed on

Add predefined schemas (FINANCE, ECOMMERCE, HEALTHCARE)
c00ac2c
verified

rtferraz committed on

Add domain_tokenizer.py - DomainTokenizerBuilder (core assembler, HF integration)
818a2e9
verified

rtferraz committed on

Add field_tokenizers.py - Sign, MagnitudeBucket, Calendar, Categorical, DiscreteNumerical tokenizers
511f3aa
verified

rtferraz committed on

Add tokenizers package init
0b06df3
verified

rtferraz committed on

Add schemas package init
04b1d24
verified

rtferraz committed on

Add schema.py - DomainSchema, FieldSpec, FieldType definitions
1a9dad0
verified

rtferraz committed on

Phase 2A: Core tokenizer library - schema, field tokenizers, composite builder, predefined schemas, 72 passing tests
0c1ca58
verified

rtferraz committed on

Update README: add ADR reference, update documentation table and repo structure
a239d6e
verified

rtferraz committed on

Add ADR-001: Implementation framework decision with detailed roadmap
25a1093
verified

rtferraz committed on

Update README with Nubank case study and expanded repo structure
e30a14d
verified

rtferraz committed on

Add Nubank nuFormer reverse-engineering analysis - full pipeline reconstruction
51149fa
verified

rtferraz committed on

Add README with project overview and vision
f930fef
verified

rtferraz committed on

Add comprehensive research report on domain-specific tokenization
be86e60
verified

rtferraz committed on

initial commit
356a72e
verified

rtferraz committed on