Qwen3-8B-TMF921-Intent-QLora
A QLoRA fine-tuned Qwen3-8B that achieves 100% in-distribution schema compliance across 6 telecom standards, 8 lifecycle operations, and adversarial rejection — evaluated on 498 stratified test samples covering all 17 target layer types.
This model converts free-form operator intents like "Deploy a URLLC slice for remote surgery with sub-1ms latency and 99.999% reliability" into structured JSON configurations conforming to TMF921, 3GPP TS 28.312, ETSI ZSM 009-1, CAMARA, O-RAN A1, and 3GPP TS 28.541.
Key Results
| Metric | Score | Samples |
|---|---|---|
| JSON Validity | 100.0% | 498/498 |
| Structure Correctness | 100.0% | 498/498 |
| All 5 KPIs Present | 100.0% | 300/300 (create ops) |
| Adversarial Rejection | 100.0% | 25/25 |
Every output across all 17 target layer types (6 specification layers, 8 lifecycle operations, and 3 adversarial categories) was valid JSON with correct structure, and all 300 create-operation outputs contained the complete set of five KPI fields.
Evaluation scope note: These results measure in-distribution schema compliance, because the test set was generated by the same pipeline as the training data. The 100% score confirms that the model has learned to produce correctly formatted, spec-compliant JSON for the template-bounded input distribution. This is analogous to text-to-SQL models that score highly on the Spider test split yet still require evaluation on user-written queries. Out-of-distribution evaluation on human-written operator intents (e.g., the ORION 100-intent benchmark) is needed to assess real-world generalization and is planned as future work.
Evaluation Results
Per-Layer Breakdown (498 stratified samples, 50 per major layer)
| Layer | N | JSON Valid | Struct Correct | All KPIs |
|---|---|---|---|---|
| `tmf921` | 50 | 100% | 100% | 100% |
| `intent_3gpp` | 50 | 100% | 100% | 100% |
| `camara` | 50 | 100% | 100% | 100% |
| `a1_policy` | 50 | 100% | 100% | 100% |
| `o1_nrm` | 50 | 100% | 100% | 100% |
| `etsi_zsm` | 50 | 100% | 100% | 100% |
| `tmf921_lifecycle_activate` | 19 | 100% | 100% | — |
| `tmf921_lifecycle_modify` | 20 | 100% | 100% | — |
| `tmf921_lifecycle_monitor` | 30 | 100% | 100% | — |
| `tmf921_lifecycle_report` | 18 | 100% | 100% | — |
| `tmf921_lifecycle_resume` | 24 | 100% | 100% | — |
| `tmf921_lifecycle_scale` | 21 | 100% | 100% | — |
| `tmf921_lifecycle_suspend` | 15 | 100% | 100% | — |
| `tmf921_lifecycle_terminate` | 26 | 100% | 100% | — |
| `adversarial_ambiguous` | 10 | 100% | 100% | — |
| `adversarial_contradictory` | 8 | 100% | 100% | — |
| `adversarial_out_of_scope` | 7 | 100% | 100% | — |
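
The per-layer counts above follow from capping each layer at 50 samples and keeping smaller layers whole. The following is a minimal sketch of that sampling rule, not the project's actual evaluation code. It assumes each test record carries a hypothetical `target_layer` field (the real dataset column may be named differently) and uses a fixed seed for repeatability.

```python
import random
from collections import defaultdict


def stratified_sample(records, key="target_layer", cap=50, seed=0):
    """Take up to `cap` records per layer; layers below the cap are kept whole.

    `key` is an assumed field name, not confirmed by the dataset card.
    """
    rng = random.Random(seed)
    by_layer = defaultdict(list)
    for record in records:
        by_layer[record[key]].append(record)

    sample = []
    for layer in sorted(by_layer):
        group = by_layer[layer]
        if len(group) > cap:
            group = rng.sample(group, cap)  # sample without replacement
        sample.extend(group)
    return sample


# Toy data: one large layer and one small layer.
toy = [{"target_layer": "tmf921", "id": i} for i in range(120)]
toy += [{"target_layer": "adversarial_out_of_scope", "id": i} for i in range(7)]
picked = stratified_sample(toy)
print(len(picked))  # 57 (50 capped + 7 kept whole)
```

Sorting the layer names before sampling keeps the result independent of record order within the seed, which makes reruns easier to compare.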
KPI Field Presence (create operations, n=300)
| Layer | N | Latency | Reliability | DL Thpt | UL Thpt | Max UEs |
|---|---|---|---|---|---|---|
| `tmf921` | 50 | 100% | 100% | 100% | 100% | 100% |
| `intent_3gpp` | 50 | 100% | 100% | 100% | 100% | 100% |
| `camara` | 50 | 100% | 100% | 100% | 100% | 100% |
| `a1_policy` | 50 | 100% | 100% | 100% | 100% | 100% |
| `o1_nrm` | 50 | 100% | 100% | 100% | 100% | 100% |
| `etsi_zsm` | 50 | 100% | 100% | 100% | 100% | 100% |
Evaluation Methodology
The evaluation used stratified sampling (50 samples per major layer, all samples for smaller layers) with standard-aware KPI checking that correctly handles how each telecom standard encodes network parameters:
- TMF921, 3GPP, CAMARA, ETSI ZSM: Direct KPI value matching with int/float tolerance
- O-RAN A1 Policy: Reliability → Packet Error Rate (PER), latency → Packet Delay Budget (pdb), throughput → GFBR/MFBR
- O1 NRM (TS 28.541): Structural element presence (`rrmPolicyMemberList`, `operationalState`, `arfcnDL`, `bSChannelBwDL`)
Full evaluation results are available in eval_v3_results.json.
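
The checks described above can be illustrated with a few small helpers. This is a sketch under stated assumptions, not the contents of `evaluate_v3.py`: all function names are hypothetical, the tolerance value is arbitrary, and the reliability-to-PER conversion (`1 - reliability/100`) is an assumed mapping that the actual evaluator may implement differently.

```python
import json
import math


def is_valid_json(text):
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except (json.JSONDecodeError, TypeError):
        return False


def kpi_matches(expected, actual, rel_tol=1e-6):
    """Compare numeric KPI values so that 50 and 50.0 count as equal."""
    try:
        return math.isclose(float(expected), float(actual), rel_tol=rel_tol)
    except (TypeError, ValueError):
        return False


def reliability_to_per(reliability_percent):
    """Convert a reliability percentage to a packet error rate (assumed mapping)."""
    return 1.0 - reliability_percent / 100.0


def has_keys(obj, required):
    """Check that every name in `required` appears as a key somewhere in nested JSON."""
    found = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                found.add(key)
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(obj)
    return set(required) <= found
```

For an O1 NRM output, a structural check might then be `has_keys(config, ["rrmPolicyMemberList", "operationalState", "arfcnDL", "bSChannelBwDL"])`, while an A1 policy check would compare the emitted error rate against `reliability_to_per(99.999)`.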
Model Details
| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-8B |
| Method | QLoRA (4-bit NF4 quantization + LoRA adapters) |
| Training dataset | nraptisss/TMF921-intent-to-config-augmented (41,815 samples) |
| Task | Natural language → structured JSON configuration translation |
| Standards covered | TMF921, 3GPP TS 28.312, ETSI ZSM 009-1, CAMARA, O-RAN A1, 3GPP TS 28.541 |
| License | Apache 2.0 |
Training Configuration
| Parameter | Value |
|---|---|
| Quantization | 4-bit NF4 + double quantization |
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Target modules | all-linear |
| Effective batch size | 32 (per_device=4 × grad_accum=8) |
| Learning rate | 1e-4 (cosine schedule, warmup_ratio=0.1) |
| Epochs | 3 |
| Max sequence length | 4,096 tokens |
| Loss masking | assistant_only_loss=True |
| Precision | bf16 |
| Flash attention | flash_attention_2 |
| Gradient checkpointing | Yes |
| Estimated VRAM | ~26 GB |
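
As a rough sanity check, the table values imply the following step counts and LoRA scaling factor. This is back-of-the-envelope arithmetic, not a log from the actual run: it assumes a single GPU, the 39,294-sample train split reported for the dataset, no sequence packing, and no held-out validation subset.

```python
import math

train_samples = 39_294   # train split size reported for the dataset
per_device_batch = 4
grad_accum = 8
num_devices = 1          # assumption: single GPU
epochs = 3
warmup_ratio = 0.1
lora_r, lora_alpha = 32, 64

# Samples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_devices
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

# LoRA scales the adapter update by alpha / r.
lora_scale = lora_alpha / lora_r

print(effective_batch)   # 32
print(steps_per_epoch)   # 1228
print(total_steps)       # 3684
print(warmup_steps)      # 369
print(lora_scale)        # 2.0
```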
Usage
Inference
```python
import json

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base_model, "nraptisss/Qwen3-8B-TMF921-Intent-QLora")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

messages = [
    {"role": "system", "content": "You are a TM Forum TMF921-compliant Intent Management system. Given a natural language network intent, output a valid TMF921 Intent Management API v4 JSON object. The response must include the full Intent resource with id, href, name, description, lifecycleState, priority, intentExpression containing IntentExpectation objects with DeliveryExpectation targets, contexts, and relatedParty references. Follow the TMF921 Open API specification and use @type annotations for polymorphic types. Ground all KPI targets in 3GPP TS 22.261 performance requirements."},
    {"role": "user", "content": "Deploy a URLLC slice for remote robotic surgery in Hospital Campus. Requirements: sub-1 ms latency, 99.999% reliability, 100 Mbps DL, 50 Mbps UL, 50 connected devices."},
]

# Render the chat template to a prompt string and tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Low-temperature sampling keeps the output close to deterministic.
outputs = model.generate(**inputs, max_new_tokens=4096, temperature=0.1, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# json.loads raises json.JSONDecodeError if the response is not valid JSON.
config = json.loads(response)
print(json.dumps(config, indent=2))
```
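
Calling `json.loads` directly on the raw response fails if the model emits anything around the JSON object. The helper below is a defensive sketch, assuming (without confirmation from the evaluation) that a response might occasionally include a `<think>...</think>` block or a Markdown code fence; `extract_json` is a hypothetical name, not part of this repository.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built up to keep this example readable


def extract_json(response):
    """Parse the first JSON value that starts at the first '{' in a model response.

    Strips an optional <think>...</think> block and Markdown code fences first.
    Raises ValueError if no JSON object can be decoded.
    """
    text = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    text = text.replace(FENCE + "json", "").replace(FENCE, "")
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in response")
    try:
        # raw_decode tolerates trailing text after the JSON object.
        obj, _ = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    return obj
```

With this helper, `config = extract_json(response)` can stand in for `json.loads(response)` when a single failure should not abort a batch of translations.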
System Prompts by Target Layer
| Layer | System Prompt Summary |
|---|---|
| `tmf921` | TMF921 v4 JSON with @type annotations, HATEOAS links, relatedParty, geographic/temporal contexts |
| `intent_3gpp` | 3GPP TS 28.312 v18.8.0 intent with intentExpectation, S-NSSAI, targets |
| `camara` | CAMARA NetworkSliceBooking with sliceProfile, areaOfService, duration |
| `a1_policy` | O-RAN A1 policy with 5QI mapping, PRB quotas, scheduler weights |
| `o1_nrm` | 3GPP TS 28.541 ManagedElement → GNBDUFunction → NRCellDU with RRM policies |
| `etsi_zsm` | ETSI ZSM 009-1 intent with objectives, constraints, context, fulfillmentRequirements |
Full system prompts for each layer are in the training dataset.
Supported Target Layers
6 Specification Layers
| Layer | Standard | Description |
|---|---|---|
| `tmf921` | TM Forum TMF921 v4 | Full Intent resource with @type annotations, HATEOAS links |
| `intent_3gpp` | 3GPP TS 28.312 Rel-18 | Intent with intentExpectation and S-NSSAI encoding |
| `camara` | CAMARA NetworkSliceBooking | Slice booking with QoS profile and area of service |
| `a1_policy` | O-RAN WG2 A1 | Policy with 5QI, PRB quotas, scheduler weights |
| `o1_nrm` | 3GPP TS 28.541 | RAN config: ManagedElement → GNBDUFunction → NRCellDU |
| `etsi_zsm` | ETSI GS ZSM 009-1 | Zero-touch intent with fulfillment requirements |
8 Lifecycle Operations
activate, modify, monitor, report, scale, suspend, resume, terminate — all following the TMF921 state machine.
3 Adversarial Categories
| Category | Example Input | Expected Output |
|---|---|---|
| Ambiguous | "Can you create a slice for educational purposes?" | CLARIFICATION_REQUIRED |
| Contradictory | "mMTC slice with sub-1ms latency for 1M devices at 10 Gbps each" | INTENT_VALIDATION_FAILED |
| Out-of-scope | "Request for a list of local volunteer opportunities" | OUT_OF_SCOPE |
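
Downstream code usually needs to tell a rejection apart from a deployable configuration before acting on the output. The sketch below searches a parsed response for any of the three codes in the table. The response shape in the usage example is hypothetical; the card does not specify which field carries the code, so the helper looks at every string value.

```python
REJECTION_CODES = {"CLARIFICATION_REQUIRED", "INTENT_VALIDATION_FAILED", "OUT_OF_SCOPE"}


def find_rejection_code(node):
    """Return the first rejection code found as a string value in nested JSON, else None."""
    if isinstance(node, str):
        return node if node in REJECTION_CODES else None
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        code = find_rejection_code(child)
        if code is not None:
            return code
    return None


# Hypothetical response shapes, for illustration only.
rejected = {"status": "OUT_OF_SCOPE", "reason": "Not a network intent."}
accepted = {"name": "urllc-slice", "lifecycleState": "acknowledged"}
print(find_rejection_code(rejected))  # OUT_OF_SCOPE
print(find_rejection_code(accepted))  # None
```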
6 Slice Types
eMBB (SST=1), URLLC (SST=2), mMTC (SST=3), V2X (SST=4), HMTC (SST=5), MPS (SST=5, distinct SD)
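
An S-NSSAI combines an 8-bit Slice/Service Type (SST) with an optional 24-bit Slice Differentiator (SD). The sketch below shows how the SST values listed above could be packed into a hex string. The string layout and the SD value used for MPS are illustrative assumptions; the encoding used in the training data may differ.

```python
# SST values as listed in this card; MPS shares SST=5 and is told apart by its SD.
SST_BY_SLICE_TYPE = {"eMBB": 1, "URLLC": 2, "mMTC": 3, "V2X": 4, "HMTC": 5, "MPS": 5}


def encode_snssai(slice_type, sd=None):
    """Return the S-NSSAI as hex: 2 digits of SST plus, if given, 6 digits of SD."""
    sst = SST_BY_SLICE_TYPE[slice_type]
    if sd is None:
        return f"{sst:02X}"
    if not 0 <= sd <= 0xFFFFFF:
        raise ValueError("SD must fit in 24 bits")
    return f"{sst:02X}{sd:06X}"


print(encode_snssai("URLLC"))             # 02
print(encode_snssai("MPS", sd=0x000001))  # 05000001 (SD chosen arbitrarily)
```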
Training Data
nraptisss/TMF921-intent-to-config-augmented — 41,815 samples (39,294 train / 2,521 test) covering 6 specification layers, 8 lifecycle operations, 3 adversarial categories, 18 industry sectors, 147 use cases, and 54 geographic regions.
See the dataset card for full documentation of construction methodology and specification grounding.
Training Pipeline
Reproducible pipeline at nraptisss/intent-translation-training with train.py, evaluate_v3.py, and run.sh.
Known Limitations
- In-distribution evaluation only: The 100% results are on a test set from the same synthetic generator. Real operator intents are more ambiguous, underspecified, and linguistically diverse. Out-of-distribution evaluation on human-authored intents is required before deployment claims.
- Single-turn only: Handles individual intent→config translations. Multi-turn conversations (create → monitor → modify → terminate) were not evaluated.
- English only: All training data is in English.
- Synthetic training data: Fine-tuned on template + LLM-augmented data rather than real operator intents, so phrasing seen in production may fall outside the training distribution.
- HMTC is speculative: SST=5 for HMTC is based on Rel-17 extensions; 6G slice types are not yet standardized.
Citation
```bibtex
@misc{qwen3_8b_tmf921_qlora_2025,
  title     = {Qwen3-8B-TMF921-Intent-QLora: QLoRA Fine-tuned LLM for
               5G/6G Intent-to-Configuration Translation},
  author    = {Raptis, Nikolaos},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/nraptisss/Qwen3-8B-TMF921-Intent-QLora},
  note      = {100\% JSON validity, structure correctness, and KPI accuracy
               on 498 stratified test samples across TMF921, 3GPP TS 28.312,
               ETSI ZSM, CAMARA, O-RAN A1, and 3GPP TS 28.541}
}
```