# TMF921 Intent-to-Configuration Translation – Training Pipeline
> **The first open-source fine-tuning pipeline for translating natural language network intents into spec-compliant 5G/6G configurations across 6 telecom standards.**
## What This Does
Fine-tunes an LLM (default: Qwen3-8B) via **4-bit QLoRA** on the [TMF921-intent-to-config-augmented](https://huggingface.co/datasets/nraptisss/TMF921-intent-to-config-augmented) dataset (41.8K samples) to translate natural language network intents into structured JSON configurations for:
| Target Standard | Description |
|---|---|
| **TMF921** | TM Forum Intent Management API v4 |
| **3GPP TS 28.312** | Intent-Driven Management (Rel-18) |
| **CAMARA** | GSMA Open Gateway NetworkSliceBooking |
| **ETSI ZSM** | Zero-touch Service Management intents |
| **O-RAN A1 Policy** | Non-RT RIC A1 policy objects |
| **3GPP O1 NRM** | O1 interface NR Network Resource Model |
The dataset also covers **TMF921 lifecycle operations** (activate, modify, suspend, resume, terminate, scale, monitor, report) and **adversarial rejection** of ambiguous, out-of-scope, or contradictory intents.
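Each sample pairs a natural-language intent with a structured configuration for one target layer. The snippet below is purely illustrative: the field names, system prompt, and output schema are assumptions, not the dataset's actual format (see the dataset card for that).

```python
# Hypothetical chat-formatted sample; field names and output schema are illustrative only.
example = {
    "messages": [
        {"role": "system", "content": "Translate the network intent into a TMF921 intent object."},
        {"role": "user", "content": "Provision a URLLC slice for a smart-factory site with "
                                    "5 ms latency and 99.999% reliability."},
        {"role": "assistant", "content": (
            '{"intent": {"name": "smart-factory-urllc", "sliceType": "URLLC", '
            '"expectations": {"latencyMs": 5, "reliabilityPct": 99.999}}}'
        )},
    ]
}
```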
## Quick Start
### 1. Clone & Install
```bash
git clone https://huggingface.co/nraptisss/intent-translation-training
cd intent-translation-training
pip install -r requirements.txt
```
### 2. Train (defaults: RTX 6000 Ada, 48 GB)
```bash
python train.py
```
This will:
- Load Qwen3-8B in 4-bit NF4 quantization
- Apply QLoRA (r=32, alpha=64) to all linear layers
- Train for 3 epochs with cosine LR schedule
- Save checkpoints and best model to `./output/`
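Under the hood, the quantization and adapter setup corresponds roughly to the sketch below (illustrative, not a verbatim excerpt of `train.py`; exact arguments such as dropout or device placement may differ):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with double quantization, bf16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# QLoRA adapters (r=32, alpha=64) on all linear layers
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```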
### 3. Evaluate
```bash
python evaluate.py --adapter_path ./output --num_samples 200
```
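Beyond the scripted evaluation, you can load the trained adapter for a quick manual check. A minimal sketch, assuming the adapter was saved to `./output/` and using a made-up intent (the exact prompt format used in training may differ):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base, "./output")  # LoRA adapter from training

messages = [
    {"role": "user", "content": "Create an eMBB slice for a stadium with 500 Mbps downlink."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```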
### 4. One-command pipeline
```bash
chmod +x run.sh
./run.sh
```
## Training Configuration
| Parameter | Default | Notes |
|---|---|---|
| Base model | `Qwen/Qwen3-8B` | Any HF causal LM works |
| Quantization | 4-bit NF4 + double quant | ~4.5 GB VRAM for weights |
| LoRA rank | 32 | target_modules="all-linear" |
| LoRA alpha | 64 | 2× rank |
| Batch size | 4 × 8 grad_accum = 32 effective | |
| Learning rate | 1e-4 | ~10× a typical full fine-tuning LR, standard for LoRA |
| Epochs | 3 | ~3,700 steps total |
| Max length | 4096 tokens | Covers 99%+ of samples |
| Loss | assistant_only_loss=True | Train only on config outputs |
| Precision | bf16 | |
| Flash attention | Yes (flash_attention_2) | |
| Gradient checkpointing | Yes | Saves ~40% VRAM |
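In TRL terms, the table maps roughly onto an `SFTConfig` like the one below (a sketch assuming a recent TRL release; some argument names differ between versions, e.g. `max_length` was previously `max_seq_length`):

```python
from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    output_dir="./output",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # 4 x 8 = 32 effective batch size
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    bf16=True,
    gradient_checkpointing=True,
    max_length=4096,                 # called max_seq_length in older TRL releases
    assistant_only_loss=True,        # loss only on the assistant (config) tokens
)

# trainer = SFTTrainer(model=model, args=training_args, train_dataset=dataset, processing_class=tokenizer)
# trainer.train()
```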
### VRAM Estimate (RTX 6000 Ada, 48 GB)
| Component | VRAM |
|---|---|
| Model weights (4-bit) | ~4.5 GB |
| LoRA adapters | ~0.5 GB |
| Activations + gradients (checkpointed) | ~12 GB |
| Optimizer states (AdamW, LoRA params only) | ~1 GB |
| KV cache / batch | ~8 GB |
| **Total estimated** | **~26 GB** |
This leaves roughly 22 GB of headroom on a 48 GB card; you can increase `--batch_size` to 8 if desired.
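As a sanity check, the numbers add up as shown below (same rough estimates as the table, not measurements):

```python
# Rough VRAM budget (GB) from the table above; per-component figures are estimates.
weights, adapters, activations, optimizer, kv_cache = 4.5, 0.5, 12, 1, 8
total = weights + adapters + activations + optimizer + kv_cache
print(f"estimated total: ~{total:.0f} GB, headroom on a 48 GB card: ~{48 - total:.0f} GB")
# estimated total: ~26 GB, headroom on a 48 GB card: ~22 GB
```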
## Customization
### Different base model
```bash
python train.py --base_model Qwen/Qwen2.5-7B-Instruct
python train.py --base_model meta-llama/Llama-3.1-8B-Instruct
python train.py --base_model microsoft/phi-4
```
### Push to Hugging Face Hub
```bash
export HF_TOKEN="hf_..."
python train.py --push_to_hub --hub_model_id your-username/model-name
```
### Adjust hyperparameters
```bash
python train.py --lr 2e-4 --epochs 5 --lora_r 64 --lora_alpha 128 --batch_size 2 --grad_accum 16
```
### No flash attention (older GPUs)
```bash
python train.py --no_flash_attn
```
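This only switches the attention backend used when the model is loaded; conceptually it amounts to the following (a sketch using the standard `transformers` `attn_implementation` argument):

```python
from transformers import AutoModelForCausalLM

# FlashAttention 2 requires an Ampere-or-newer GPU plus the flash-attn package;
# --no_flash_attn falls back to PyTorch's built-in SDPA attention instead.
use_flash_attn = False
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    attn_implementation="flash_attention_2" if use_flash_attn else "sdpa",
)
```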
## Evaluation Metrics
The evaluation script measures:
| Metric | Description |
|---|---|
| **JSON Validity Rate** | % of outputs that are valid JSON |
| **Structure Correctness** | % with correct root keys per standard |
| **KPI Field Accuracy** | % of outputs containing the correct latency, throughput, reliability, and UE values |
| **All KPIs Correct** | % of outputs where all 5 KPI fields match |
| **Adversarial Accuracy** | % of bad intents correctly rejected |
Results are broken down per target layer (TMF921, 3GPP, CAMARA, etc.).
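The first two metrics boil down to checks like the ones sketched below (simplified; the expected root keys per standard are illustrative assumptions, not the exact mapping used in `evaluate.py`):

```python
import json

# Illustrative root keys per target layer; the real mapping lives in evaluate.py.
EXPECTED_ROOT_KEYS = {
    "tmf921": {"intent"},
    "camara": {"sliceBooking"},
    "oran_a1": {"policyObject"},
}

def json_validity(output: str) -> bool:
    """JSON Validity Rate: does the raw model output parse as JSON?"""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

def structure_correct(output: str, layer: str) -> bool:
    """Structure Correctness: are the expected root keys for this standard present?"""
    if not json_validity(output):
        return False
    parsed = json.loads(output)
    expected = EXPECTED_ROOT_KEYS.get(layer, set())
    return isinstance(parsed, dict) and expected.issubset(parsed.keys())
```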
## File Structure
```
├── train.py           # QLoRA fine-tuning script
├── evaluate.py        # Evaluation with per-layer metrics
├── run.sh             # One-command train+eval pipeline
├── requirements.txt   # Python dependencies
└── README.md          # This file
```
## Dataset
[nraptisss/TMF921-intent-to-config-augmented](https://huggingface.co/datasets/nraptisss/TMF921-intent-to-config-augmented) – 41,815 samples:
- **39,153** standard intent→config pairs across 6 target layers
- **1,552** lifecycle operation samples (8 TMF921 operations)
- **141** adversarial samples (ambiguous, out-of-scope, contradictory)
- **18 sectors**, **147 use cases**, **55 regions**
- **6 slice types**: eMBB, URLLC, mMTC, V2X, MPS, HMTC
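The dataset loads like any other Hub dataset; a minimal sketch (the split name is an assumption, check the dataset card for the exact splits and columns):

```python
from datasets import load_dataset

ds = load_dataset("nraptisss/TMF921-intent-to-config-augmented", split="train")
print(ds)      # features and row count
print(ds[0])   # inspect one raw sample
```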
## References
- [ORION: Intent-Aware Orchestration in Open RAN](https://arxiv.org/abs/2603.03667)
- [TelecomGPT](https://arxiv.org/abs/2407.09424)
- [ORANSight-2.0](https://arxiv.org/abs/2503.05200)
- [NEFMind](https://arxiv.org/abs/2508.09240)
- [TMF921 Intent Management API v5.0](https://www.tmforum.org/resources/specification/tmf921-intent-management-api-rest-specification-r21-0-0/)
- [3GPP TS 28.312 Rel-18](https://www.3gpp.org/dynareport/28312.htm)
## License
Apache 2.0