Add predefined schemas (FINANCE, ECOMMERCE, HEALTHCARE) c00ac2c verified rtferraz committed 10 days ago
Add domain_tokenizer.py - DomainTokenizerBuilder (core assembler, HF integration) 818a2e9 verified rtferraz committed 10 days ago
Add field_tokenizers.py - Sign, MagnitudeBucket, Calendar, Categorical, DiscreteNumerical tokenizers 511f3aa verified rtferraz committed 10 days ago
Add schema.py - DomainSchema, FieldSpec, FieldType definitions 1a9dad0 verified rtferraz committed 10 days ago
Phase 2A: Core tokenizer library - schema, field tokenizers, composite builder, predefined schemas, 72 passing tests 0c1ca58 verified rtferraz committed 10 days ago
Update README: add ADR reference, update documentation table and repo structure a239d6e verified rtferraz committed 10 days ago
Add ADR-001: Implementation framework decision with detailed roadmap 25a1093 verified rtferraz committed 10 days ago
Update README with Nubank case study and expanded repo structure e30a14d verified rtferraz committed 10 days ago
Add Nubank nuFormer reverse-engineering analysis - full pipeline reconstruction 51149fa verified rtferraz committed 10 days ago
Add comprehensive research report on domain-specific tokenization be86e60 verified rtferraz committed 10 days ago