Commit History

remove embedded hugging face tokens
3a2f322

nmstech committed on

merge github main before publish
15d9001

nmstech committed on

use zemberek-python and add regression tests
5a6f887

nmstech committed on

Upload folder using huggingface_hub
532470d
verified

nmstech committed on

Fix README example output and punctuation status
b209242

nmstech committed on

Upload folder using huggingface_hub
3a27be3
verified

nmstech committed on

Update TR-MMLU benchmark score to 92.64%
6a4e31f

nmstech committed on

Upload folder using huggingface_hub
b719e3c
verified

nmstech committed on

Migrate to zemberek-python, remove JVM dependency and 31MB JAR, apply O(N^2) init fix
8c72d18

nmstech committed on

Update README.md
2064cba
verified

nmstech committed on

Actually add the Zemberek file
3e2daf4

nmstech committed on

Merge and file reorganization
cc4bd82

nmstech committed on

Zemberek jar file added successfully
3bcedcf

nmstech committed on

zemberek cleanup
03804fd

nmstech committed on

Delete nedo_turkish_tokenizer/data/zemberek_yeni.jar
e6da48e
verified

nmstech committed on

Delete .claude
b8c4228
verified

nmstech committed on

Zemberek file added under a new name
90f2d92

nmstech committed on

Zemberek added with LFS
fae9ef5

nmstech committed on

Delete nedo_turkish_tokenizer/data/zemberek-full.jar
b27c27d
unverified

NMS committed on

Delete .claude directory
6b07da5
unverified

NMS committed on

Rename project from TurkTokenizer to NedoTurkishTokenizer
cfffd93

nmstech, Claude Opus 4.6 committed on

Update README.md
e430fca
verified

nmstech committed on

Update README.md
92ffed4
verified

nmstech committed on

Add smart ACRONYM detection: TDK-based disambiguation for uppercase tokens
fcd513a
verified

nmstech committed on

Fix broken placeholder mechanism: replace with segment-based tokenization
58e6961
verified

nmstech committed on

Upload turk_tokenizer/data/turkish_proper_nouns.txt with huggingface_hub
986e073
verified

nmstech committed on

Upload turk_tokenizer/_preprocessor.py with huggingface_hub
8f794ec
verified

nmstech committed on

Replace hardcoded base lists with TDK vocab lookup in apostrophe split
183e656
verified

nmstech committed on

Fix İ lowercase bug + apostrophe merge for BPE-split foreign words
63dbb3f
verified

nmstech committed on

Fix İ lowercase bug + apostrophe merge for BPE-split foreign words
6330193
verified

nmstech committed on

Load TDK words from HF repo, fallback to TDK API
b9c10fd
verified

nmstech committed on

Add TDK word list (64K words)
9ec9d90
verified

nmstech committed on

Fix build backend: setuptools.backends.legacy → setuptools.build_meta
091c896
verified

nmstech committed on

Update model card with Use This Model section
47e9fd4
verified

nmstech committed on

Add AutoTokenizer support (trust_remote_code)
864ffd2
verified

nmstech committed on

Add AutoTokenizer support (trust_remote_code)
be2f46e
verified

nmstech committed on

Add AutoTokenizer support (trust_remote_code)
fffa764
verified

nmstech committed on

Initial release: TurkTokenizer v1.0.0 — TR-MMLU 92%
a0e8f24
verified

nmstech committed on

initial commit
40ce37f
verified

nmstech committed on