# TMF921 Intent-to-Configuration Translation — Training Pipeline
The first open-source fine-tuning pipeline for translating natural language network intents into spec-compliant 5G/6G configurations across 6 telecom standards.
## What This Does
Fine-tunes an LLM (default: Qwen3-8B) via 4-bit QLoRA on the TMF921-intent-to-config-augmented dataset (41.8K samples) to translate natural language network intents into structured JSON configurations for:
| Target Standard | Description |
|---|---|
| TMF921 | TM Forum Intent Management API v4 |
| 3GPP TS 28.312 | Intent-Driven Management (Rel-18) |
| CAMARA | GSMA Open Gateway NetworkSliceBooking |
| ETSI ZSM | Zero-touch Service Management intents |
| O-RAN A1 Policy | Non-RT RIC A1 policy objects |
| 3GPP O1 NRM | O1 interface NR Network Resource Model |
The dataset also covers TMF921 lifecycle operations (activate, modify, suspend, resume, terminate, scale, monitor, report) and adversarial rejection cases (ambiguous, out-of-scope, and contradictory intents).
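For orientation, here is the general shape of one intent→config pair. The field names below are hypothetical, chosen for illustration; the actual dataset schema may differ:

```python
# Illustrative shape of a training pair; field names are hypothetical,
# not the dataset's actual schema.
import json

sample = {
    "intent": (
        "Create a URLLC slice for a smart-factory site with 5 ms "
        "end-to-end latency and 99.999% reliability."
    ),
    "target_layer": "TMF921",
    "config": {
        "intent": {
            "name": "smart-factory-urllc-slice",
            "expression": {"latency_ms": 5, "reliability_pct": 99.999},
        }
    },
}
print(json.dumps(sample, indent=2))
```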
## Quick Start

### 1. Clone & Install

```bash
git clone https://huggingface.co/nraptisss/intent-translation-training
cd intent-translation-training
pip install -r requirements.txt
```
### 2. Train (defaults — RTX 6000 Ada, 48 GB)

```bash
python train.py
```
This will:
- Load Qwen3-8B in 4-bit NF4 quantization (see the sketch after this list)
- Apply QLoRA (r=32, alpha=64) to all linear layers
- Train for 3 epochs with a cosine LR schedule
- Save checkpoints and the best model to `./output/`
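In transformers/peft terms, the first two steps amount to roughly the following sketch; train.py's actual code may differ in details such as dropout or compute dtype:

```python
# Sketch of 4-bit NF4 loading plus QLoRA adapters; values mirror the
# defaults described above, other settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NF4 quantization
    bnb_4bit_use_double_quant=True,      # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    quantization_config=bnb_cfg,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
lora_cfg = LoraConfig(
    r=32,
    lora_alpha=64,                       # 2x rank
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()
```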
### 3. Evaluate

```bash
python evaluate.py --adapter_path ./output --num_samples 200
```
### 4. One-command pipeline

```bash
chmod +x run.sh
./run.sh
```
## Training Configuration
| Parameter | Default | Notes |
|---|---|---|
| Base model | Qwen/Qwen3-8B | Any HF causal LM works |
| Quantization | 4-bit NF4 + double quant | ~4.5 GB VRAM for weights |
| LoRA rank | 32 | target_modules="all-linear" |
| LoRA alpha | 64 | 2× rank |
| Batch size | 4 × 8 grad_accum = 32 effective | |
| Learning rate | 1e-4 | 10× base for LoRA |
| Epochs | 3 | ~3,700 steps total |
| Max length | 4096 tokens | Covers 99%+ of samples |
| Loss | assistant_only_loss=True | Train only on config outputs |
| Precision | bf16 | |
| Flash attention | Yes (flash_attention_2) | |
| Gradient checkpointing | Yes | Saves ~40% VRAM |
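The table rows map fairly directly onto a TRL `SFTConfig`. The sketch below is an assumption about how train.py wires them up, and parameter names (e.g. `max_length`, `assistant_only_loss`) vary across TRL versions:

```python
# How the defaults above could be expressed as a TRL SFTConfig; a sketch,
# not train.py's exact code.
from trl import SFTConfig

args = SFTConfig(
    output_dir="./output",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,    # 4 x 8 = 32 effective batch
    learning_rate=1e-4,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    max_length=4096,
    bf16=True,
    gradient_checkpointing=True,
    assistant_only_loss=True,         # loss only on assistant (config) tokens
)
# SFTTrainer(model=model, args=args, train_dataset=dataset).train()
```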
## VRAM Estimate (RTX 6000 Ada, 48 GB)
| Component | VRAM |
|---|---|
| Model weights (4-bit) | ~4.5 GB |
| LoRA adapters | ~0.5 GB |
| Activations + gradients (checkpointed) | ~12 GB |
| Optimizer states (AdamW, LoRA params only) | ~1 GB |
| KV cache / batch | ~8 GB |
| Total estimated | ~26 GB |
This leaves roughly 22 GB of headroom, so you can raise `batch_size` to 8 if desired; see the back-of-envelope estimate below.
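A rough check of these numbers, assuming activations and KV cache scale approximately linearly with batch size:

```python
# Rough budget arithmetic (GB); estimates from the table, not measurements.
weights, lora, activations, optimizer, kv = 4.5, 0.5, 12.0, 1.0, 8.0
print(weights + lora + activations + optimizer + kv)        # ~26 GB at batch_size=4
print(weights + lora + optimizer + 2 * (activations + kv))  # ~46 GB at batch_size=8
```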
## Customization

### Different base model
```bash
python train.py --base_model Qwen/Qwen2.5-7B-Instruct
python train.py --base_model meta-llama/Llama-3.1-8B-Instruct
python train.py --base_model microsoft/phi-4
```
### Push to Hugging Face Hub

```bash
export HF_TOKEN="hf_..."
python train.py --push_to_hub --hub_model_id your-username/model-name
```
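If you prefer to merge and publish the adapter yourself after training, something like the following works (a sketch, assuming the adapter was saved to `./output`):

```python
# Merge the LoRA adapter into the base model and push it to the Hub.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="bfloat16")
model = PeftModel.from_pretrained(base, "./output")
merged = model.merge_and_unload()                # fold LoRA weights into the base
merged.push_to_hub("your-username/model-name")   # reads HF_TOKEN from the env
```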
### Adjust hyperparameters

```bash
python train.py --lr 2e-4 --epochs 5 --lora_r 64 --lora_alpha 128 --batch_size 2 --grad_accum 16
```
### No flash attention (older GPUs)

```bash
python train.py --no_flash_attn
```
## Evaluation Metrics
The evaluation script measures:
| Metric | Description |
|---|---|
| JSON Validity Rate | % of outputs that are valid JSON |
| Structure Correctness | % with correct root keys per standard |
| KPI Field Accuracy | % containing correct latency, throughput, reliability, UE values |
| All KPIs Correct | % where ALL 5 KPI fields match |
| Adversarial Accuracy | % of bad intents correctly rejected |
Results are broken down per target layer (TMF921, 3GPP, CAMARA, etc.).
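As an illustration of the first two metrics, checks of this shape are easy to reproduce. This is a sketch: the expected root keys per standard live in evaluate.py, and `EXPECTED_ROOT_KEYS` below is a hypothetical stand-in.

```python
# Sketch of JSON-validity and root-key checks; EXPECTED_ROOT_KEYS is a
# hypothetical stand-in for the per-standard schema in evaluate.py.
import json

EXPECTED_ROOT_KEYS = {
    "TMF921": {"intent"},    # assumed, for illustration only
}

def json_valid(output: str) -> bool:
    """True if the model output parses as JSON."""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

def structure_correct(output: str, layer: str) -> bool:
    """True if the parsed config is an object with the layer's expected root keys."""
    if not json_valid(output):
        return False
    cfg = json.loads(output)
    return isinstance(cfg, dict) and EXPECTED_ROOT_KEYS.get(layer, set()) <= cfg.keys()
```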
## File Structure

```
├── train.py          # QLoRA fine-tuning script
├── evaluate.py       # Evaluation with per-layer metrics
├── run.sh            # One-command train+eval pipeline
├── requirements.txt  # Python dependencies
└── README.md         # This file
```
## Dataset

`nraptisss/TMF921-intent-to-config-augmented` — 41,815 samples:
- 39,153 standard intent→config pairs across 6 target layers
- 1,552 lifecycle operation samples (8 TMF921 operations)
- 141 adversarial samples (ambiguous, out-of-scope, contradictory)
- 18 sectors, 147 use cases, 55 regions
- 6 slice types: eMBB, URLLC, mMTC, V2X, MPS, HMTC
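To inspect the dataset before training (the split name is the usual Hub default and may differ):

```python
# Peek at the dataset on the Hub; assumes a "train" split exists.
from datasets import load_dataset

ds = load_dataset("nraptisss/TMF921-intent-to-config-augmented", split="train")
print(ds)      # features and row count
print(ds[0])   # one intent -> config sample
```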
## References
- ORION: Intent-Aware Orchestration in Open RAN
- TelecomGPT
- ORANSight-2.0
- NEFMind
- TMF921 Intent Management API v5.0
- 3GPP TS 28.312 Rel-18
## License
Apache 2.0