# Azure Advisor Qwen2.5-0.5B (SFT)
Fine-tuned from `Qwen/Qwen2.5-0.5B-Instruct` to generate Azure Advisor-style recommendations using Supervised Fine-Tuning (SFT).
## Model Description

This model was trained to analyze Azure workload configurations and generate structured recommendations across five Azure Advisor categories:
- Cost - Cost optimization recommendations
- Security - Security posture improvements
- Performance - Performance optimization suggestions
- OperationalExcellence - Operational best practices
- HighAvailability - Reliability and availability improvements
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-0.5B-Instruct |
| Method | SFT with LoRA adapters |
| Dataset | thegovind/azure-advisor-sft (348 train, 41 eval) |
| Training Steps | 200 |
| Learning Rate | 2e-4 (cosine schedule) |
| LoRA Rank / Alpha | 16 / 32 |
| Quantization | 4-bit QLoRA (NF4) |
| Hardware | NVIDIA RTX 3090 (24GB) |
| Training Time | ~5 minutes |
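The hyperparameters above could be wired into `peft`/`transformers` roughly as follows. This is a hypothetical sketch, not the actual training script (which is not published with this card); only values taken from the table are used.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# Hypothetical sketch of the quantization + LoRA setup implied by the table;
# the real training script may differ in target modules and other details.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit QLoRA
    bnb_4bit_quant_type="nf4",             # NF4 quantization
    bnb_4bit_compute_dtype=torch.float16,
)

lora_config = LoraConfig(
    r=16,                  # LoRA rank (from the table)
    lora_alpha=32,         # LoRA alpha (from the table)
    task_type="CAUSAL_LM",
)
```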
## Training Metrics
| Metric | Value |
|---|---|
| Pre-SFT Baseline | 0.80/10 |
| Post-SFT Score | 3.72/10 |
| Improvement | +2.92 |
| Final Training Loss | 0.029 |
| Final Eval Loss | 0.035 |
### Loss Trajectory
1.76 -> 1.06 -> 0.47 -> 0.24 -> 0.13 -> 0.064 -> 0.050 -> 0.044 -> 0.038 -> 0.034 -> 0.029
## Evaluation (5 Reward Functions, max 10.0)
| Function | Weight | Description |
|---|---|---|
| Format Compliance | 1.5 | Correct XML tags and JSON structure |
| Category Correctness | 2.0 | Valid Advisor categories |
| Grounding Quality | 2.0 | Claims supported by input evidence |
| Actionability | 2.0 | Concrete, feasible next steps |
| Completeness | 2.5 | Coverage of issues with proper schema |
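As a sketch of how the weighted functions combine into the 10-point score, assuming each reward function returns a normalized value in [0, 1] (the function names and normalization here are illustrative, not the published implementation):

```python
# Weights copied from the table above; they sum to 10.0.
WEIGHTS = {
    "format_compliance": 1.5,
    "category_correctness": 2.0,
    "grounding_quality": 2.0,
    "actionability": 2.0,
    "completeness": 2.5,
}

def combined_score(rewards: dict) -> float:
    """Weighted sum of per-function rewards, each assumed to be in [0, 1]."""
    return sum(WEIGHTS[name] * rewards.get(name, 0.0) for name in WEIGHTS)

# A perfect response across all five functions scores the maximum 10.0.
print(combined_score({name: 1.0 for name in WEIGHTS}))  # 10.0
```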
## Output Format
The model generates structured output with three tagged sections:

- `<ANALYSIS>`: reasoning about the workload state
- `<RECOMMENDATIONS>`: JSON array of recommendation objects
- `<SUMMARY>`: brief summary of key recommendations
Each recommendation includes: `category`, `impact`, `resourceId`, `problem`, `solution`, `potentialBenefits`, `evidence`, `nextSteps`, `confidence`.
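For illustration, a single recommendation object with these fields might look like the following. Every value below is invented for demonstration; the real model derives them from the input workload.

```python
import json

# Illustrative recommendation object; all field values are made up.
recommendation = {
    "category": "Cost",
    "impact": "High",
    "resourceId": "/subscriptions/<sub-id>/resourceGroups/rg-prod/providers/Microsoft.Compute/virtualMachines/vm-web-01",
    "problem": "VM SKU is oversized relative to observed utilization.",
    "solution": "Resize to a smaller SKU or schedule auto-shutdown outside business hours.",
    "potentialBenefits": "Lower monthly compute spend.",
    "evidence": "Average CPU utilization under 5% for the past 30 days.",
    "nextSteps": [
        "Review utilization metrics in Azure Monitor",
        "Resize the VM and validate workload performance",
    ],
    "confidence": 0.85,
}

# The model emits a JSON array of such objects inside <RECOMMENDATIONS>.
print(json.dumps([recommendation], indent=2))
```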
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the base model in fp16, then attach the fine-tuned LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "thegovind/azure-advisor-qwen25-0.5b")
tokenizer = AutoTokenizer.from_pretrained("thegovind/azure-advisor-qwen25-0.5b")

messages = [
    {"role": "system", "content": "You are an Azure Advisor assistant..."},
    {"role": "user", "content": "Analyze this Azure workload and provide recommendations..."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,   # required for temperature to take effect
        temperature=0.7,
    )

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
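A small helper can split the generated text back into its tagged sections. This is a sketch that assumes the model closes each section with a matching `</TAG>`; if only opening tags are emitted, split on those instead.

```python
import json
import re

def parse_advisor_output(text: str) -> dict:
    """Extract the three tagged sections; parse RECOMMENDATIONS as JSON.

    Assumes <TAG>...</TAG> pairs as described under Output Format.
    """
    sections = {}
    for tag in ("ANALYSIS", "RECOMMENDATIONS", "SUMMARY"):
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        sections[tag.lower()] = match.group(1).strip() if match else ""
    try:
        sections["recommendations_json"] = json.loads(sections["recommendations"])
    except json.JSONDecodeError:
        sections["recommendations_json"] = None  # malformed or missing JSON
    return sections

sample = "<ANALYSIS>High CPU.</ANALYSIS><RECOMMENDATIONS>[]</RECOMMENDATIONS><SUMMARY>Scale out.</SUMMARY>"
print(parse_advisor_output(sample)["summary"])  # Scale out.
```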
## W&B Training Dashboard
- SFT Run: wandb.ai/thegovind/azure-advisor-model/runs/quzg7fgs
- Project: wandb.ai/thegovind/azure-advisor-model
## Related Resources
- GRPO Model: `thegovind/azure-advisor-qwen25-0.5b-grpo` (further refined with reward-based GRPO training)
- SFT Dataset: `thegovind/azure-advisor-sft` (410 training examples across 15 scenario types)
- GRPO Benchmark: `thegovind/azure-advisor-grpo-benchmark` (106 evaluation examples with ground truth)