Qwen3 models (123M/300M/600M parameters) trained from scratch on 2.47B Kazakh and Russian (kk+ru) tokens. Includes the tokenizer, datasets, and checkpoints.
- stukenov/ekitil-core-qwen3-123m-kkru-base-v1 (Text Generation • 0.1B)
- stukenov/ekitil-core-qwen3-300m-kkru-base-v1 (Text Generation • 0.2B)
- stukenov/ekitil-core-qwen3-600m-kkru-base-v1 (Text Generation • 0.7B)
- stukenov/ekitil-vocab-bpe-64k-kkru-v1