ARGUS - Aviation Cybersecurity Expert LLM

ARGUS Banner

ARGUS is a fine-tuned Qwen2.5-14B-Instruct model specialized in aviation cybersecurity. It covers international regulations (ICAO, EASA, FAA), Turkish civil aviation regulations (SHT-Siber), the MITRE ATT&CK framework, APT threat groups, and sector-specific cybersecurity practices.

Model Details

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-14B-Instruct |
| Method | QLoRA 4-bit (Unsloth) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Data | 10,830 samples (regulatory, MITRE, APT, general CTI) |
| Epochs | 1 |
| Eval Loss | 1.068 (best) |
| Languages | Turkish, English |

Training Data Distribution

| Category | Samples | Weight | Weighted Share |
|---|---|---|---|
| Authority (ICAO, EASA, SHT-Siber) | 1,947 | 3x | 48.1% |
| MITRE ATT&CK Groups | 1,166 | 2x | 19.4% |
| APT Reports | 2,286 | 1x | 19.1% |
| General CTI | 1,558 | 1x | 13.0% |
| Negatives (anti-hallucination) | 50 | 1x | 0.4% |

Percentages reflect each category's share of the training mix after applying the oversampling weight, not its share of the raw sample count.
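
The weighted shares can be reproduced by multiplying each category's sample count by its weight and normalizing. A quick sketch (counts and weights are taken from the table; treating the "Weight" column as an oversampling multiplier is an assumption about how the mix was computed):

```python
# Weighted mix: share = samples * weight / total weighted samples.
# Counts and weights come from the table above; the normalization scheme
# is an assumption, but it lands close to the published percentages.
categories = {
    "Authority": (1947, 3),
    "MITRE":     (1166, 2),
    "APT":       (2286, 1),
    "CTI":       (1558, 1),
    "Negatives": (50,   1),
}
total = sum(n * w for n, w in categories.values())
shares = {name: round(100 * n * w / total, 1) for name, (n, w) in categories.items()}
# Authority comes out near 48%, close to the table's 48.1%.
```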

Recommended System Prompt

```
Sen ARGUS, bir havacılık siber güvenlik uzmanısın. ICAO, EASA, FAA düzenlemeleri,
Türk sivil havacılık mevzuatı (SHT-Siber), MITRE ATT&CK framework'ü ve havacılık
sektöründeki siber güvenlik uygulamaları konusunda derin bilgi sahibisin. Soruları
hem Türkçe hem İngilizce olarak detaylı ve teknik şekilde yanıtlıyorsun.
```

(English: "You are ARGUS, an aviation cybersecurity expert. You have deep knowledge of ICAO, EASA and FAA regulations, Turkish civil aviation legislation (SHT-Siber), the MITRE ATT&CK framework, and cybersecurity practices in the aviation sector. You answer questions in detail and technically, in both Turkish and English.")

Benchmark & RAG Performance

This model achieves its best performance when combined with a RAG (Retrieval-Augmented Generation) pipeline. Fine-tuning teaches the model domain expertise, terminology, and response format, while RAG provides grounded, factual information from source documents.

Benchmark: 4-Configuration Comparison (10 Questions)

| Configuration | Correct | Hallucination | Wrong |
|---|---|---|---|
| Base Qwen (No RAG) | 1/10 | 3/10 | 6/10 |
| Base Qwen + RAG | 7/10 | 1/10 | 2/10 |
| ARGUS (No RAG) | 3/10 | 4/10 | 3/10 |
| ARGUS + RAG | 10/10 | 0/10 | 0/10 |

Detailed Question-by-Question Results

| # | Question | Base Qwen (No RAG) | Base Qwen + RAG | ARGUS (No RAG) | ARGUS + RAG |
|---|---|---|---|---|---|
| 1 | APT28 aviation TTPs | Generic, with typos | "No information" | Detailed TTP analysis | "No information" |
| 2 | SHT-Siber reporting deadlines | "Managed by THK" (WRONG) | Article 64.1, urgency levels | Vague | 15 business days, 3-month, EK-14 |
| 3 | MuddyWater Turkey operations | Generic, superficial | Spear phishing, detailed | With MITRE TTPs | MOIS, MERCURY, detailed |
| 4 | EASA IS.I.OR.230 | "Software security" (WRONG) | "We can guess" | Wrong | ISO 27001 controls |
| 5 | Volt Typhoon LotL techniques | Mistook it for a LoL game, replied in Chinese | Netsh, LOLBins | "South Korea" (WRONG) | PRC, OT, detailed |
| 6 | ICAO Annex 17 Article 4.9 | "Air base" (FABRICATED) | Vague | Fabricated | SMS requirement |
| 7 | Boeing CyberShield 3000 (*) | "I don't know", then guesses anyway | "No information" + Chinese | HALLUCINATION | "No information", clean |
| 8 | APT-TR-7 (*) | HALLUCINATION, fabricated | "No information" | HALLUCINATION | "No information", clean |
| 9 | PROMETHIUM malware families | "A CSIRT group" (COMPLETELY WRONG) | Truvasys, StrongPity | Havex (wrong) | StrongPity, correct |
| 10 | APT attacks on Turkish airports | Generic, invented "Ağ Salıncakları" ("network swings") | "No information" | Fabricated | "No information", clean |

(*) Anti-hallucination test questions โ€” these are fictional entities that do not exist.

(**) "No information available" responses on unanswerable questions are counted as correct โ€” honest refusal is preferred over hallucination.

Key findings:

  • ARGUS + RAG achieves 10/10 accuracy with zero hallucinations โ€” answers correctly or honestly says "no information available"
  • RAG alone improves the base model significantly but still produces hallucinations on edge cases
  • ARGUS alone learns domain terminology and format but hallucinates without grounding data
  • Base Qwen lacks aviation cybersecurity knowledge entirely (confused Volt Typhoon with League of Legends)
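
The 50 negative samples in the training mix are what teach the clean refusals seen in rows 7, 8, and 10. The actual dataset schema is not published; a hypothetical example of what such an anti-hallucination pair might look like in a standard chat-format SFT dataset:

```python
# Hypothetical anti-hallucination training pair (illustrative format only,
# NOT the actual ARGUS dataset schema): a question about a fictional entity
# paired with an explicit refusal, so the model learns to prefer
# "no information" over inventing an answer.
negative_sample = {
    "messages": [
        {"role": "user",
         "content": "What TTPs does the APT group 'APT-TR-7' use against airports?"},
        {"role": "assistant",
         "content": "I have no information about a threat group named 'APT-TR-7'; "
                    "it does not appear in my sources."},
    ]
}
```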

Recommended RAG Setup

  • Vector DB: Qdrant
  • Embedding Model: intfloat/multilingual-e5-base (Turkish + English)
  • LLM Server: llama-server (llama.cpp) with Q5_K_M GGUF
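
Wiring these pieces together mostly amounts to embedding the query, retrieving passages from Qdrant, and prepending them to the system prompt before applying the chat template. A minimal sketch of the prompt-assembly step (the `query: `/`passage: ` prefixes are the multilingual-e5 embedding convention; the helper names and instruction wording here are hypothetical, not part of any ARGUS release):

```python
# Minimal RAG prompt assembly (hypothetical helpers, not an official ARGUS API).
# multilingual-e5 models expect "query: " / "passage: " prefixes at embed time.
def e5_query(text: str) -> str:
    return "query: " + text

def e5_passage(text: str) -> str:
    return "passage: " + text

def build_rag_messages(question: str, chunks: list[str]) -> list[dict]:
    """Prepend retrieved chunks as grounding context for the chat template."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    system = (
        "Sen ARGUS, bir havacılık siber güvenlik uzmanısın. "
        "Answer ONLY from the context below; if the answer is not there, "
        "say you have no information.\n\nContext:\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# Chunks would come from a Qdrant similarity search on e5_query(question);
# the chunk text below is illustrative.
msgs = build_rag_messages(
    "SHT-Siber raporlama süreleri nelerdir?",
    ["SHT-Siber: olaylar 15 iş günü içinde raporlanır (illustrative chunk)"],
)
```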

Usage

With Transformers + PEFT

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen2.5-14B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto", torch_dtype="auto")
model = PeftModel.from_pretrained(model, "yunusshin/argus-qwen25-14b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

messages = [
    {"role": "system", "content": "Sen ARGUS, bir havacılık siber güvenlik uzmanısın."},
    {"role": "user", "content": "EASA Part-IS kapsamında ISMS gereksinimleri nelerdir?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

With GGUF (llama-server / Ollama)

A Q5_K_M GGUF quantization (9.8 GB) is also available in this repository.

```bash
# llama-server
llama-server --model argus-q5_k_m.gguf --host 0.0.0.0 --port 8080 --ctx-size 4096 --n-gpu-layers 99

# Ollama
ollama create argus -f Modelfile
ollama run argus
```
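
The `ollama create` step above expects a Modelfile. A minimal sketch (the system prompt is the one recommended earlier; recent Ollama versions read the Qwen2 chat template from the GGUF metadata, and the parameter values here are assumptions, not shipped defaults):

```
FROM ./argus-q5_k_m.gguf
SYSTEM "Sen ARGUS, bir havacılık siber güvenlik uzmanısın."
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
```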

Limitations

  • Without RAG, the model may hallucinate on topics outside its training data
  • Designed specifically for aviation cybersecurity; general cybersecurity knowledge is inherited from the base model
  • Regulation article numbers and dates should always be verified against official sources

Training Infrastructure

  • Hardware: NVIDIA DGX Spark (GB10 Blackwell), 119.6 GB unified memory
  • Framework: Unsloth + TRL (SFTTrainer)

Author

Yunus ลžahin

License

Apache 2.0 (following the base model license)
