Instruction-tuned model based on Qwen/Qwen3-4B, trained on the Russian-language dataset GrandMaster2. Designed for efficient text processing in Russian and English, with accurate responses and fast task execution.
@inproceedings{nikolich2024vikhr,
title={Vikhr: Advancing Open-Source Bilingual Instruction-Following Large Language Models for Russian and English},
author={Aleksandr Nikolich and Konstantin Korolev and Sergei Bratchikov and Nikolay Kompanets and Igor Kiselev and Artem Shelmanov},
booktitle={Proceedings of the 4th Workshop on Multilingual Representation Learning (MRL) @ EMNLP-2024},
year={2024},
publisher={Association for Computational Linguistics},
url={https://arxiv.org/pdf/2405.13929}
}
@misc{qwen3technicalreport,
title={Qwen3 Technical Report},
author={Qwen Team},
year={2025},
eprint={2505.09388},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.09388},
}