# Azure Cloud Solution Architect - Qwen 3.5 0.8B (SFT Merged)

A fully merged Qwen 3.5 0.8B model fine-tuned via SFT to act as an Azure Cloud Solution Architect. The LoRA adapters have been merged into the base weights for easy deployment: no adapter loading needed.
## What This Model Does
- Answers questions about Azure architecture patterns and best practices
- Identifies Azure services in architecture diagrams
- Explains cloud design decisions (networking, compute, storage, security, etc.)
- Trained on 1,678 Q&A pairs scraped from the Azure Architecture Center
## Training Details
| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen3.5-0.8B |
| Method | Supervised Fine-Tuning (SFT) with LoRA, then merged |
| LoRA Rank | 16 |
| Dataset | thegovind/azure-architecture-vqa (1,678 train / 187 test) |
| Training Time | 42.6 minutes on RTX 4090 |
| Final Loss | 0.6517 |
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "thegovind/azure-architect-qwen35-0.8b-merged"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "What Azure service is best for global content delivery?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
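Qwen-family chat models typically respond best when the tokenizer's chat template is applied rather than a raw prompt string. A hedged variant of the snippet above, wrapped in a helper (the helper name and sample question are illustrative, not part of this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "thegovind/azure-architect-qwen35-0.8b-merged"


def ask(question: str, max_new_tokens: int = 256) -> str:
    """Generate an answer using the model's chat template (sketch)."""
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    messages = [{"role": "user", "content": question}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    with torch.no_grad():
        output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Usage: `ask("Compare Azure Front Door and Azure CDN for global delivery.")`.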
## When to Use This vs. the LoRA Version
| Version | Use When |
|---|---|
| This (merged) | Deployment, inference servers, GGUF conversion, Foundry Local |
| LoRA | Further fine-tuning, experimentation, saving storage |
## Related Models & Resources