# GoldenNet-Qwen2.5-0.5B-QLoRA-v1

## Model Description

GoldenNet-Qwen2.5-0.5B-QLoRA-v1 is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct, specialized for processing Iraqi government correspondence.

The model performs two key tasks:

- **Document Classification** - assigns correspondence to one of 8 categories
- **Named Entity Recognition** - extracts entities such as persons, organizations, locations, dates, monetary values, and legal references

## Supported Categories (التصنيفات)
| Arabic | English | Description |
|---|---|---|
| طلب | Request | Formal requests for approval, resources, or actions |
| شكوى | Complaint | Grievances and complaints from citizens or departments |
| تقرير | Report | Status reports, statistics, and progress updates |
| إعلام | Notification | Official announcements and notifications |
| استفسار | Inquiry | Questions seeking information or clarification |
| دعوة | Invitation | Invitations to events, meetings, or conferences |
| تعميم | Circular | Directives and circulars from higher authorities |
| إحالة | Referral | Document referrals to other departments |
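For downstream routing, the eight Arabic labels above can be mapped to English keys. The mapping below is an illustrative sketch: the English key names are our own convention, not part of the model's output schema.

```python
# Illustrative mapping from the model's Arabic category labels to
# English routing keys (the English key names are our own convention).
CATEGORY_MAP = {
    "طلب": "request",
    "شكوى": "complaint",
    "تقرير": "report",
    "إعلام": "notification",
    "استفسار": "inquiry",
    "دعوة": "invitation",
    "تعميم": "circular",
    "إحالة": "referral",
}

def to_english(label: str) -> str:
    """Return the English routing key for an Arabic category label."""
    return CATEGORY_MAP.get(label.strip(), "unknown")

print(to_english("طلب"))  # request
```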
## Training Details

### Configuration
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-0.5B-Instruct |
| Fine-tuning Method | QLoRA (4-bit quantization + LoRA) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.05 |
| Learning Rate | 2e-4 |
| Epochs | 3 |
| Batch Size | 2 (effective: 16 with gradient accumulation) |
| Max Sequence Length | 2048 |
| Precision | BF16 |
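The effective batch size in the table follows from the per-device batch size times the gradient-accumulation steps. The sketch below restates the table as a plain dict for reference; the key names are our own, and the accumulation step count of 8 is inferred from 16 / 2 rather than stated in the card.

```python
# Hyperparameters from the table above as a plain dict.
# Key names are our own convention; in a real run these values would be
# passed to peft.LoraConfig / transformers.TrainingArguments.
qlora_config = {
    "base_model": "Qwen/Qwen2.5-0.5B-Instruct",
    "lora_r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "learning_rate": 2e-4,
    "epochs": 3,
    "per_device_batch_size": 2,
    "gradient_accumulation_steps": 8,  # inferred: 16 effective / 2 per device
    "max_seq_length": 2048,
    "bf16": True,
}

# Sanity check: effective batch = per-device batch * accumulation steps
effective_batch = (qlora_config["per_device_batch_size"]
                   * qlora_config["gradient_accumulation_steps"])
print(effective_batch)  # 16
```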
### Training Results
| Metric | Value |
|---|---|
| Training Loss | 0.448 |
| Evaluation Loss | 0.2998 |
| Training Time | ~49 seconds |
| Hardware | NVIDIA RTX 5070 (8GB VRAM) |
### Loss Progression
- Epoch 1: 0.912
- Epoch 2: 0.319
- Epoch 3: 0.200
## Usage

### With Transformers (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Alamori/GoldenNet-Qwen2.5-0.5B-QLoRA-v1",
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Alamori/GoldenNet-Qwen2.5-0.5B-QLoRA-v1"
)

# Example: Classification
correspondence = """جمهورية العراق
وزارة التربية
مديرية تربية بغداد
العدد: 4521/ت/2025
التاريخ: 2025/05/15
إلى/ السيد مدير عام التعليم العام المحترم
م/ طلب تعيين معلمين
تحية طيبة...
نرجو الموافقة على تعيين 50 معلماً في المدارس الابتدائية.
مع التقدير
مدير التربية"""

instruction = "صنّف المراسلة الحكومية التالية إلى إحدى الفئات: طلب، شكوى، تقرير، إعلام، استفسار، دعوة، تعميم، إحالة. أجب بصيغة JSON تتضمن الفئة ودرجة الثقة والتبرير."

messages = [
    {"role": "user", "content": f"{instruction}\n\n{correspondence}"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is needed for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
# Output: {"category": "طلب", "confidence": 0.96, "reasoning": "المراسلة تطلب تعيين موظفين..."}
```
### With Ollama

```bash
# Create the model from a Modelfile
ollama create goldennet-iraqi-gov -f Modelfile

# Run inference
ollama run goldennet-iraqi-gov
```
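Assuming the LoRA adapter has been merged into the base model and exported to GGUF, a minimal Modelfile could look like the sketch below. The file path is a placeholder; `num_ctx 2048` matches the recommended max input length.

```
FROM ./goldennet-qwen2.5-0.5b-qlora-v1.gguf

# Low temperature for near-deterministic classification output
PARAMETER temperature 0.1
# Context window matching the training max sequence length
PARAMETER num_ctx 2048
```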
### With vLLM

```bash
vllm serve Alamori/GoldenNet-Qwen2.5-0.5B-QLoRA-v1 --port 8000
```
## Example Outputs

### Classification Task

**Input** (a circular from the Council of Ministers regarding official working hours):

```
صنّف المراسلة الحكومية التالية...
[تعميم من مجلس الوزراء بشأن الدوام الرسمي]
```

**Output:**

```json
{"category": "تعميم", "confidence": 0.97, "reasoning": "المراسلة تعميم قانوني يتضمن توجيهات إلزامية"}
```
### Entity Extraction Task

**Input** (a health report from the Basra Health Directorate):

```
استخرج جميع الكيانات المسماة من المراسلة الحكومية التالية...
[تقرير صحي من دائرة صحة البصرة]
```

**Output:**

```json
{
  "persons": ["السيد وزير الصحة", "د. سعاد الموسوي"],
  "organizations": ["وزارة الصحة", "دائرة صحة البصرة"],
  "locations": ["محافظة البصرة", "الزبير", "الفاو"],
  "dates": ["2025/06/10"],
  "reference_numbers": ["7823/ص/2025"],
  "monetary_values": ["5 مليار دينار"],
  "quantities": ["3 مراكز صحية", "120 طبيباً"],
  "projects": [],
  "laws_regulations": []
}
```
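Before indexing extraction results, the payload can be checked against the expected key set. A minimal sketch (key names taken from the example above; the function name is our own):

```python
# Expected top-level keys in an entity-extraction payload,
# taken from the example output above.
EXPECTED_KEYS = {
    "persons", "organizations", "locations", "dates",
    "reference_numbers", "monetary_values", "quantities",
    "projects", "laws_regulations",
}

def validate_entities(payload: dict) -> list[str]:
    """Return the expected keys missing from an extraction payload."""
    return sorted(EXPECTED_KEYS - payload.keys())

# A payload with only two keys is flagged as incomplete
print(validate_entities({"persons": [], "dates": []}))
```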
## Limitations
- Optimized specifically for Iraqi government correspondence format
- Best performance on formal Arabic administrative documents
- May require adaptation for other Arabic dialects or document types
- Recommended max input length: 2048 tokens
## Intended Use
- Government document processing and automation
- Administrative workflow optimization
- Document routing and prioritization
- Metadata extraction from official correspondence
- Research on Arabic NLP for government applications
## Ethical Considerations
This model is designed for legitimate government administrative purposes. Users should:
- Ensure compliance with data privacy regulations
- Use appropriate access controls for sensitive documents
- Validate model outputs before making critical decisions
- Not use for surveillance or unauthorized data collection
## Citation

```bibtex
@misc{goldennet-qwen-qlora-v1,
  author    = {Golden Net AI},
  title     = {GoldenNet-Qwen2.5-0.5B-QLoRA-v1: Iraqi Government Correspondence Classifier},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Alamori/GoldenNet-Qwen2.5-0.5B-QLoRA-v1}
}
```
## About Golden Net AI
Golden Net AI is dedicated to developing AI solutions for Arabic language processing, with a focus on government and enterprise applications in Iraq and the MENA region.
## License
This model is released under the Apache 2.0 License.
*Developed by Golden Net AI*

*Empowering Iraqi Government Digital Transformation*