rtferraz/domainTokenizer
arxiv: 9 papers
domainTokenizer/src/domain_tokenizer
53.1 kB
1 contributor
History: 11 commits
rtferraz
Add PLR embeddings (Gorishniy et al. 2022)
d685c0e
verified
13 days ago
models
Add PLR embeddings (Gorishniy et al. 2022)
13 days ago
schemas
Add predefined schemas (FINANCE, ECOMMERCE, HEALTHCARE)
13 days ago
tokenizers
Add domain_tokenizer.py - DomainTokenizerBuilder (core assembler, HF integration)
13 days ago
__init__.py
Safe
895 Bytes
Phase 2A: Core tokenizer library - schema, field tokenizers, composite builder, predefined schemas, 72 passing tests
13 days ago
schema.py
Safe
7.49 kB
Add schema.py - DomainSchema, FieldSpec, FieldType definitions
13 days ago