SozKZ Core Llama 1B Kazakh Instruct v2

An instruction-tuned version of sozkz-core-llama-1b-kk-base-v1, trained on Kazakh instruction data with LoRA supervised fine-tuning (SFT).

Note: This is an experimental v2 release. For most use cases, the base model is recommended. The instruct version can follow simple instructions in Kazakh, but its quality is limited by the small fine-tuning dataset.

Model Details

Property	Value
Base model	stukenov/sozkz-core-llama-1b-kk-base-v1 (step-final)
Method	LoRA SFT (merged weights)
LoRA rank (r)	64
LoRA alpha	128
Learning rate	2e-4
Epochs	2
Batch size	16
Training time	~25 min on H100 SXM
Dataset	AmanMussa/kazakh-instruction-v2
Training examples	52,173
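
The hyperparameters above imply a few derived quantities. The following is a minimal sketch, assuming the standard LoRA scaling factor alpha / r, that the batch size of 16 is the effective batch size (no gradient accumulation), and that the final partial batch is kept; none of these is stated in the table. The 2048 x 2048 layer shape is only an illustration based on the hidden size, since the set of adapted modules is not given.

```python
import math

# Values taken from the Model Details table.
LORA_RANK = 64
LORA_ALPHA = 128
EPOCHS = 2
BATCH_SIZE = 16
TRAIN_EXAMPLES = 52_173


def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA (alpha / r)."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out linear layer (matrices A and B)."""
    return rank * (d_in + d_out)


def optimizer_steps(examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps, ASSUMING no gradient accumulation and no dropped last batch."""
    return math.ceil(examples / batch_size) * epochs


if __name__ == "__main__":
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_RANK))
    # Illustrative shape only: a square 2048 x 2048 projection.
    print("LoRA params for one 2048x2048 layer:", lora_params(2048, 2048, LORA_RANK))
    print("Optimizer steps:", optimizer_steps(TRAIN_EXAMPLES, BATCH_SIZE, EPOCHS))
```

Under these assumptions the update is scaled by 2.0 and training runs for 6,522 optimizer steps.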

Architecture

Same architecture as the base model:

1.08B parameters, hidden size 2048, 22 layers, 16 attention heads, 4 KV heads (grouped-query attention, GQA)
Vocabulary of 50,257 tokens, maximum context of 1,024 tokens, tied embeddings, bfloat16 weights
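
To illustrate what the 4 KV heads mean in practice, the sketch below estimates the key/value cache size at full context. It assumes the head dimension equals hidden size / attention heads (the usual Llama convention, not stated above) and a batch of one sequence; `kv_cache_bytes` is a hypothetical helper, not part of any library.

```python
# Architecture values from the list above.
HIDDEN_SIZE = 2048
NUM_LAYERS = 22
NUM_HEADS = 16
NUM_KV_HEADS = 4
MAX_CONTEXT = 1024
BYTES_PER_VALUE = 2  # bfloat16 uses 2 bytes per value


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value, batch_size=1):
    """Bytes needed to cache keys and values for every layer and position."""
    # The factor 2 accounts for storing both keys and values.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value * batch_size


# ASSUMPTION: head_dim = hidden size / number of attention heads.
head_dim = HIDDEN_SIZE // NUM_HEADS

gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, head_dim, MAX_CONTEXT, BYTES_PER_VALUE)
# Hypothetical comparison: the same model with one KV head per attention head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_HEADS, head_dim, MAX_CONTEXT, BYTES_PER_VALUE)

if __name__ == "__main__":
    print(f"GQA KV cache: {gqa_bytes / 2**20:.0f} MiB")
    print(f"Full multi-head KV cache: {mha_bytes / 2**20:.0f} MiB")
```

With these assumptions the cache is about 44 MiB per sequence at 1,024 tokens, a quarter of what 16 KV heads would need.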

Inference Examples

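Examples from training evaluation (approximate English glosses in parentheses):

Q: Қазақстанның астанасы қай қала? (Which city is the capital of Kazakhstan?)
A: Астананың туған күні - 10 шілде. (Astana's birthday is 10 July.)

Q: Абай Құнанбаев туралы қысқаша айтып бер (Tell me briefly about Abai Kunanbaev.)
A: Абай - ұлы ақын, ойшыл және композитор. Ол адамдарды білім мен білімге шақырды және адамдарға өмір бойы үйренуге мүмкіндік берді. Ол өзінің өлеңдерін «Заратуштралық», оның ішінде «Жартас» және «Түнгі түнде». Ол сонымен қатар ол өз заманына сай, оны өз заманының ең маңызды философтарының бірі етті. (Abai is a great poet, thinker, and composer. He urged people toward knowledge and gave them the chance to learn throughout their lives. His poems «Заратуштралық», including «Жартас» and «Түнгі түнде». He was also in keeping with his time, which made him one of the most important philosophers of his era.)

Q: Жасанды интеллект дегеніміз не? (What is artificial intelligence?)
A: AI (жасанды интеллект) - бұл машиналар, мысалы, адам сияқты тапсырмаларды орындау үшін пайдаланылатын алгоритмдер мен жүйелер. Ол шешімдер қабылдау және машиналарды оқыту арқылы адамдар сияқты шешім қабылдауға арналған. (AI (artificial intelligence) is machines, for example algorithms and systems used to perform tasks the way a human does. It is meant to make decisions like humans through decision-making and machine learning.)

Q: Денсаулықты сақтау үшін не істеу керек? (What should one do to stay healthy?)
A: Тиімді тамақтану, тұрақты жаттығулар және ұйқы жеткілікті. (Effective nutrition, regular exercise, and sleep are enough.)

Q: Каспий теңізі туралы айтып бер (Tell me about the Caspian Sea.)
A: Каспий теңізінің аты - «теңіз» деген тіркесім, бұл теңіздің керемет тереңдігі мен кеңдігін көрсетеді. Бұл теңізге әсер етудің ең үлкен оқиғасы және әлемдегі ең ірі теңіздердің бірі болып саналады. (The name of the Caspian Sea is the phrase "sea", which shows the sea's remarkable depth and breadth. It is considered the biggest event of influencing the sea and one of the largest seas in the world.)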

Usage

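from transformers import AutoModelForCausalLM, AutoTokenizer

# The repository is gated: request access and authenticate with the Hub before loading.
model_id = "stukenov/sozkz-core-llama-1b-kk-instruct-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the merged weights in bfloat16; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto")

# Prompt in the same "Q: ... A:" format as the inference examples.
prompt = "Q: Қазақстанның астанасы қай қала?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Sampled decoding; the repetition penalty discourages repeated phrases.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
    do_sample=True
)
# The decoded string contains the prompt followed by the generated answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))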
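
Because the decoded output contains the prompt as well as the answer, a small amount of string handling can help. The helpers below are a hypothetical sketch, not part of the model's API: `build_prompt` reproduces the "Q: ...\nA:" format used above, and `extract_answer` strips the prompt and cuts the text at a following "Q:" line, on the assumption that the model may continue with a further question.

```python
def build_prompt(question: str) -> str:
    """Format a question in the "Q: ...\\nA:" style used in the usage example."""
    return f"Q: {question.strip()}\nA:"


def extract_answer(decoded: str, prompt: str) -> str:
    """Return only the answer text from a decoded generation.

    ASSUMPTION: the decoded string starts with the prompt, and any later
    line starting with "Q:" is an unwanted continuation.
    """
    text = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    # Keep only the text before the next question marker, if there is one.
    text = text.split("\nQ:", 1)[0]
    return text.strip()


if __name__ == "__main__":
    prompt = build_prompt("Қазақстанның астанасы қай қала?")
    # Stand-in for tokenizer.decode(...) output; not a real model response.
    decoded = prompt + " <generated answer>\nQ: <next question>"
    print(extract_answer(decoded, prompt))
```

In the usage example, `decoded` would be the string returned by `tokenizer.decode(outputs[0], skip_special_tokens=True)`.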

Limitations

Experimental v2 release: the base model is recommended for most downstream tasks.
Instruction following is inconsistent; the model sometimes ignores the question format.
May hallucinate facts -- do not use for factual reference.
Limited to 1024 context length.
Fine-tuned on a relatively small instruction dataset (52K examples).
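
The 1,024-token limit covers the prompt and the generated tokens together. The sketch below shows one way to budget for it, assuming token IDs are available as a plain list; `max_prompt_tokens` and `truncate_left` are hypothetical helpers, and keeping the most recent tokens is a design choice rather than something the model requires.

```python
MAX_CONTEXT = 1024  # maximum context length stated for this model


def max_prompt_tokens(max_new_tokens: int, max_context: int = MAX_CONTEXT) -> int:
    """Number of prompt tokens that leaves room for max_new_tokens."""
    budget = max_context - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return budget


def truncate_left(token_ids, max_new_tokens: int, max_context: int = MAX_CONTEXT):
    """Drop the oldest tokens so prompt plus generation fits in the context."""
    budget = max_prompt_tokens(max_new_tokens, max_context)
    # Keep the final `budget` tokens, so the trailing "A:" cue survives.
    return list(token_ids[-budget:])


if __name__ == "__main__":
    print(max_prompt_tokens(256))
    print(len(truncate_left(list(range(1000)), max_new_tokens=256)))
```

With `max_new_tokens=256`, as in the usage example, the prompt can use at most 768 tokens.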

License

MIT License with gated access (manual approval is required for models with 300M or more parameters).

Citation

@misc{sozkz-llama-1b-kk-instruct-v2-2026,
  title={SozKZ Core Llama 1B Kazakh Instruct v2},
  author={Saken Tukenov},
  year={2026},
  url={https://huggingface.co/stukenov/sozkz-core-llama-1b-kk-instruct-v2}
}
