ArinUmut committed on
Commit 7482c16 · verified · 1 Parent(s): c349e58

Add model card metadata

Files changed (1)
  1. README.md +22 -0
README.md CHANGED
@@ -1,3 +1,25 @@
+---
+language:
+- tr
+- kk
+- ky
+- uz
+- ug
+- ba
+- tt
+- az
+- crh
+- tk
+license: apache-2.0
+tags:
+- tokenizer
+- sentencepiece
+- bpe
+- turkic
+- multilingual
+library_name: transformers
+---
+
 # Pan-Turkic BPE Tokenizer
 
 A SentencePiece BPE tokenizer with 65,536 vocabulary size, purpose-built for the Turkic language family. Covers Latin, Cyrillic, and Arabic scripts used across Turkic languages.
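
To sanity-check the metadata block added in this commit, the front matter can be read back and inspected. The following is a minimal sketch, not the tooling the Hub itself uses: `parse_front_matter` is a hypothetical helper written for this illustration, and it handles only the flat scalars and simple lists that appear in this README. For anything more complex, a full YAML parser would be the safer choice.

```python
# Front matter and heading copied from the README in this commit.
README = """\
---
language:
- tr
- kk
- ky
- uz
- ug
- ba
- tt
- az
- crh
- tk
license: apache-2.0
tags:
- tokenizer
- sentencepiece
- bpe
- turkic
- multilingual
library_name: transformers
---

# Pan-Turkic BPE Tokenizer
"""


def parse_front_matter(text):
    """Hypothetical helper: parse a flat front matter block.

    Supports only `key: value` scalars and `key:` followed by `- item` lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present
    meta = {}
    current_key = None  # key whose list items are currently being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []
                current_key = key
    return meta


metadata = parse_front_matter(README)
print(metadata["license"])
print(len(metadata["language"]), "language codes")
print(metadata["tags"])
```

A check like this could run before pushing a model card, for example to confirm that every expected language code is listed and that `license` and `library_name` are set.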