# Qwen2.5-MM-1.5B-v1.1-Pretrained

This model is a refined version of Qwen2.5-1.5B-Instruct, further pre-trained on a high-quality, manually curated Myanmar (Burmese) dataset to improve its Burmese language understanding and generation capabilities.
## Model Highlights
- Version: 1.1 (Incremental Improvement)
- Focus: Enhancing Burmese linguistic structures while maintaining ethical alignment.
- Training Method: 4-bit LoRA (Low-Rank Adaptation) using the Unsloth framework for efficient learning.
- Data Philosophy: Curated to avoid toxic content, biased misinformation, and offensive language.
## Training Results (10,000 Steps)
The model was trained for 10,000 steps with the following metrics:
- Final Training Loss: 5.3031
- Average Training Loss: 6.4400
- Samples processed: 302,081
- Training Time: ~7 hours 34 minutes
## Training Specifications
- Hardware: 1x GPU
- Batch Size: 4 (per device)
- Gradient Accumulation Steps: 8
- Total Batch Size: 32
- Optimizer: AdamW (Unsloth default)
- Learning Rate: Gradually decayed to 0.0
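The batch-size figures above are consistent with each other; a quick sanity check in Python (the interpretation of the small gap between expected and reported sample counts is an assumption, not something stated on this card):

```python
# Effective batch size = per-device batch * gradient accumulation steps * num GPUs
per_device_batch = 4
grad_accum_steps = 8
num_gpus = 1

effective_batch = per_device_batch * grad_accum_steps * num_gpus
print(effective_batch)  # 32, matching "Total Batch Size" above

# Over 10,000 steps this upper-bounds the sample count at 320,000;
# the reported 302,081 is slightly lower, plausibly due to partial
# final batches or sequence packing (assumption).
steps = 10_000
print(effective_batch * steps)  # 320000
```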
## Ethical Considerations
This model has been trained on datasets specifically filtered to remove:
- Political Bias: Minimizing influence from one-sided news sources.
- Harmful Content: Removing toxic, adult, and hate-speech content found in common web crawls.
- Information Purity: Focusing on formal prose and structured Myanmar language.
## Usage (Inference)
You can use this model with the Unsloth library or standard Transformers:
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "URajinda/Qwen2.5-MM-1.5B-v1.1-Pretrained",
    max_seq_length = 2048,
    load_in_4bit = True,
)
```
## Model Tree

- Base model: URajinda/Qwen2.5-MM-1.5B-Base