Vesna-R1-1.5B-Reasoning
Vesna-R1-1.5B-Reasoning is a fine-tuned reasoning-oriented adapter based on unsloth/deepseek-r1-distill-qwen-1.5b-bnb-4bit.
This repository contains a lightweight fine-tuned adapter trained with Unsloth, designed to improve reasoning-style responses, instruction following, and overall conversational coherence while preserving the efficiency of the original 1.5B base model.
Model Details
- Model name: Vesna-R1-1.5B-Reasoning
- Developed by: zumberisclown
- Base model: unsloth/deepseek-r1-distill-qwen-1.5b-bnb-4bit
- License: Apache-2.0
- Language(s): English
- Frameworks: Transformers, TRL, Unsloth
Description
This model is a fine-tuned adapter built on top of a distilled DeepSeek-R1 Qwen 1.5B variant.
It was trained with Unsloth, which enables faster and more memory-efficient fine-tuning.
The goal of this project is to enhance the base model’s performance on:
- reasoning-style generations
- instruction-following tasks
- conversational responses
- structured answer formatting
Training
- Base model: unsloth/deepseek-r1-distill-qwen-1.5b-bnb-4bit
- Training library: Unsloth
- Model type: Fine-tuned adapter
- Optimization goal: Efficient reasoning-focused instruction tuning
Training with Unsloth enables significantly faster and more memory-efficient fine-tuning than standard full-precision approaches, which makes iterating on a 1.5B-class model practical on modest hardware.
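As a rough illustration, the kind of Unsloth + TRL setup described above typically looks like the following sketch. The dataset name, sequence length, LoRA rank, and other hyperparameters here are assumptions for illustration only, not the actual training configuration of this adapter.

```python
# Hypothetical sketch of an Unsloth LoRA fine-tuning run; all hyperparameters
# and the dataset are placeholders, not the real training recipe.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit quantized base model through Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/deepseek-r1-distill-qwen-1.5b-bnb-4bit",
    max_seq_length=2048,   # assumed sequence length
    load_in_4bit=True,
)

# Attach a LoRA adapter; only these low-rank matrices are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # assumed LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Fine-tune on an instruction dataset (placeholder dataset name).
dataset = load_dataset("some/instruction-dataset", split="train")
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()
```

Because only the adapter weights are updated, the resulting artifact is small and must be loaded on top of the base model at inference time.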
Intended Use
This model is intended for:
- general instruction following
- lightweight reasoning tasks
- experimentation with small reasoning-oriented language models
- research and hobbyist workflows
Limitations
As a 1.5B parameter class model, this adapter has important limitations:
- it may struggle with complex multi-step reasoning
- it is not guaranteed to be reliable for factual or high-stakes tasks
- performance may vary significantly outside the training distribution
- outputs should be reviewed before use in production or critical settings
Usage
This repository contains adapter weights only: load the base model first, then apply the adapter weights from this repository on top of it.
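A minimal sketch of that loading pattern with Transformers and PEFT is shown below. The adapter repository id used here is an assumption; substitute the actual path of this repository.

```python
# Sketch: load the 4-bit base model, then apply the adapter with PEFT.
# "zumberisclown/Vesna-R1-1.5B-Reasoning" is a hypothetical repo id.
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_id = "unsloth/deepseek-r1-distill-qwen-1.5b-bnb-4bit"
adapter_id = "zumberisclown/Vesna-R1-1.5B-Reasoning"  # assumed repo path

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

messages = [{"role": "user", "content": "Explain why 17 is a prime number."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Alternatively, Unsloth's `FastLanguageModel.from_pretrained` can load the adapter repository directly and will resolve the base model automatically.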
Acknowledgements
This project was trained with Unsloth, an excellent library for fast and memory-efficient LLM fine-tuning.
