# ruGPT-3 XL GGUF
A 1.3B-parameter GPT-3-style language model for Russian, converted from the original
ai-forever/rugpt3xl Megatron-LM checkpoint into HuggingFace transformers format and
then to GGUF for use with llama.cpp.
This is a base (pretrained) model, not instruction-tuned. It performs text completion and can be fine-tuned for downstream tasks.
Details are in the paper "A family of pretrained transformer language models for Russian".
## Model Details
| Parameter | Value |
|---|---|
| Parameters | 1.3B |
| Architecture | GPT-3 (decoder-only transformer) |
| Hidden size | 2048 |
| Layers | 24 |
| Attention heads | 16 |
| FFN intermediate size | 8192 |
| Max sequence length | 2048 |
| Vocabulary | 50,264 tokens (BPE) |
| Activation | GELU |
| Normalization | Pre-LayerNorm |
| Position encoding | Learned absolute |
| Precision | float16 |
| Training data | 80B tokens of Russian text (4 epochs) |
| Test perplexity | 12.05 |
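As a sanity check, the 1.3B parameter count can be reproduced from the architecture figures in the table. This is a rough estimate that ignores biases and LayerNorm weights:

```python
# Rough parameter-count estimate from the table above
# (ignores biases, LayerNorm weights, and output-head details).
hidden, layers, ffn = 2048, 24, 8192
vocab, seq = 50264, 2048

embeddings = vocab * hidden + seq * hidden  # token + learned position embeddings
attention = 4 * hidden * hidden             # Q, K, V and output projections
mlp = 2 * hidden * ffn                      # up- and down-projections
per_layer = attention + mlp

total = embeddings + layers * per_layer
print(f"~{total / 1e9:.2f}B parameters")    # ~1.32B, matching the stated 1.3B
```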
## Quick Start
Example with Q4_K_M:
```shell
./llama.cpp/build/bin/llama-cli \
  -m ./ruGPT3XL-GGUF/ruGPT3XL-q4_k_m.gguf \
  -c 2048 \
  -p "Москва - столица" \
  -n 128 \
  --temp 0.7 \
  --top-p 0.9 \
  --repeat-penalty 1.2
```
Notes:

- Use `-c 2048` for the native context length.
- Prefer `ruGPT3XL-q4_k_m.gguf` or `ruGPT3XL-q8_0.gguf` for CPU inference.
- Use `ruGPT3XL-f16.gguf` mainly for GPU inference.
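To choose between the quantizations, a back-of-envelope file-size estimate helps. The bits-per-weight figures below are my approximate averages for llama.cpp quantization schemes, not exact values; real files also carry metadata and mixed-precision tensors:

```python
# Approximate GGUF file sizes for a ~1.3B-parameter model.
# Bits-per-weight values are rough averages for llama.cpp quant types
# (an assumption, not exact figures from this repository).
PARAMS = 1.3e9
BITS_PER_WEIGHT = {"q4_k_m": 4.8, "q8_0": 8.5, "f16": 16.0}

for name, bpw in BITS_PER_WEIGHT.items():
    gb = PARAMS * bpw / 8 / 1e9
    print(f"{name}: ~{gb:.1f} GB")
# q4_k_m: ~0.8 GB, q8_0: ~1.4 GB, f16: ~2.6 GB
```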
Start server:
```shell
./llama.cpp/build/bin/llama-server \
  -m ./ruGPT3XL-GGUF/ruGPT3XL-q4_k_m.gguf \
  -c 2048 \
  --host 127.0.0.1 \
  --port 8080
```
Example request:
```shell
curl http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Вопрос: Какая столица России?\n\nОтвет: ",
    "n_predict": 128,
    "temperature": 0.7,
    "top_p": 0.9,
    "repeat_penalty": 1.2
  }'
```
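The same request can be issued from Python with only the standard library. This is a minimal sketch against llama-server's `/completion` endpoint, assuming the server above is running on 127.0.0.1:8080:

```python
import json
import urllib.request

def build_payload(prompt: str) -> dict:
    """Sampling parameters mirroring the curl example above."""
    return {
        "prompt": prompt,
        "n_predict": 128,
        "temperature": 0.7,
        "top_p": 0.9,
        "repeat_penalty": 1.2,
    }

def complete(prompt: str, url: str = "http://127.0.0.1:8080/completion") -> str:
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # llama-server returns the generated text in the "content" field
        return json.loads(resp.read())["content"]

# Example (requires a running server):
# print(complete("Вопрос: Какая столица России?\n\nОтвет: "))
```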
## Limitations
- This is a base model trained on Russian internet text. It may generate biased, factually incorrect, or offensive content.
- The model was trained primarily on Russian text. It has limited capability in other languages.
- Maximum context length is 2048 tokens. Inputs longer than this will be truncated.
- The model is not instruction-tuned and works best for text completion rather than following specific instructions.
## Citation
```bibtex
@misc{rugpt3xl-gguf,
  title={ruGPT3XL-GGUF},
  author={Pavel Rykov},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/evilfreelancer/ruGPT3XL-GGUF}
}
```
## Links
- A family of pretrained transformer language models for Russian - paper on Google Scholar
- ai-forever/rugpt3xl - original model
- ai-forever/ru-gpts - original training codebase