aksara-tokenizer-v2 / tokenizer_config.json
Ezekiel999's picture
Upload tokenizer_config.json with huggingface_hub
4b3b114 verified
{
"version": "2.0",
"vocab_size": 32768,
"pre_tokenizer": "Whitespace",
"normalizer": "NFKC+Lowercase",
"model": "BPE",
"optimized_for": "small models <= 200M"
}