# Telecom Intent-to-Config Pipeline
Complete, zero-cost pipeline for fine-tuning and evaluating LLMs that translate natural language network intents into 5G/6G standard-compliant configurations.
## What's Inside
| File | Purpose |
|------|---------|
| `train.py` | QLoRA fine-tuning on your intent dataset (Kaggle T4x2 optimized) |
| `inference.py` | Generate configs from natural language intents (interactive or batch) |
| `merge_and_push.py` | Merge LoRA adapters + push merged model to Hugging Face Hub |
| `benchmark.py` | Evaluate on test set: JSON validity, schema compliance, semantic similarity |
| `kaggle_notebook.ipynb` | Ready-to-run Kaggle notebook (download scripts → train → test → benchmark) |
| `requirements.txt` | Python dependencies |
## Supported Datasets (Your Own)
- `nraptisss/TMF921-intent-to-config-augmented` (~35K, multi-layer)
- `nraptisss/TMF921-intent-to-config-25k` (~25K, multi-layer)
- `nraptisss/telecom-intent-config-sft-10k` (~10K, multi-layer)
All datasets use ChatML `messages` format with system/user/assistant roles.
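A single training record in this format looks roughly like the sketch below (the message contents are illustrative placeholders, not an actual row from the datasets):

```python
import json

# Illustrative ChatML-style record; real rows differ in content,
# but the three-role structure is the same across all three datasets.
record = {
    "messages": [
        {"role": "system", "content": "You translate network intents into TMF921 JSON."},
        {"role": "user", "content": "Deploy URLLC slice for factory automation with 1ms latency"},
        {"role": "assistant", "content": json.dumps({"intentType": "urllc_slice"})},
    ]
}

def is_chatml_record(rec: dict) -> bool:
    """Check the system/user/assistant structure SFTTrainer expects."""
    msgs = rec.get("messages", [])
    roles = [m.get("role") for m in msgs]
    return roles == ["system", "user", "assistant"] and all(
        isinstance(m.get("content"), str) for m in msgs
    )

print(is_chatml_record(record))  # True
```

A quick pass of `is_chatml_record` over a new dataset before training catches malformed rows early, before `SFTTrainer` hits them mid-epoch.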
## Supported Target Layers
| Layer | Standard | Output Format |
|-------|----------|---------------|
| `tmf921` | TM Forum Intent Management API v5.0.0 | JSON intent FVO |
| `intent_3gpp` | 3GPP TS 28.312 Rel-18 | JSON intent payload |
| `camara` | CAMARA NetworkSliceBooking | OpenAPI JSON |
| `etsi_zsm` | ETSI ZSM GS 009-1 | JSON service profile |
| `a1_policy` | O-RAN WG2 A1 Interface | JSON policy |
| `o1_nrm` | 3GPP TS 28.541 NR NRM | YANG/XML-style JSON |
## Quick Start: Kaggle (Free T4x2 GPU)
### Option A — Notebook (easiest, 5-minute setup)
1. Go to [Kaggle β†’ New Notebook](https://www.kaggle.com/code)
2. Add **GPU T4 x2** accelerator (right panel)
3. Paste the contents of `kaggle_notebook.ipynb` into the first cell
4. Run all cells top to bottom
### Option B — Terminal
```bash
pip install -q transformers trl peft accelerate bitsandbytes datasets sentence-transformers huggingface-hub
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/train.py
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/inference.py
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/merge_and_push.py
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/benchmark.py
python train.py # 2-3 hours on T4x2
python inference.py --intent "Deploy URLLC slice for factory automation with 1ms latency"
python benchmark.py --max_samples 100
python merge_and_push.py # pushes to your hub
```
## Hyperparameters (T4x2 Optimized)
| Parameter | Value | Rationale |
|-----------|-------|-----------|
| Model | Qwen2.5-7B-Instruct | Fits 16GB VRAM, strong reasoning |
| Quantization | 4-bit NF4 | QLoRA, enables 7B on T4 |
| LoRA rank | 64 | Balances capacity and memory |
| LoRA alpha | 16 | Standard α = r/4 |
| Batch size | 1 per GPU | Fits T4 VRAM |
| Gradient accumulation | 4 | Effective batch = 8 (1 × 2 GPUs × 4) |
| Learning rate | 2e-4 | 10× base for LoRA (TRL recommendation) |
| Max length | 512 | Covers most intent→config pairs |
| Epochs | 3 | Sufficient for ~35K samples |
| Liger kernel | **Disabled** | Crashes on T4 with gradient checkpointing |
| FP16 | True | T4 has no bf16 support |
**Note:** `liger-kernel` is intentionally **disabled** by default (`use_liger_kernel=False`). It causes Triton crashes on T4 GPUs when combined with gradient checkpointing. Only enable it if training on A100 / L40 / H100.
## Expected Results
After 3 epochs on `TMF921-intent-to-config-augmented`:
- **Training time**: ~2–3 hours on Kaggle T4x2
- **JSON validity**: >90% on test set
- **Schema compliance**: >85% (key presence check)
- **Model size**: ~14GB merged (FP16), ~200MB LoRA adapters
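The validity and compliance metrics above can be checked with logic along these lines; this is a minimal sketch, and `benchmark.py` may compute them differently (the required-key set here is a placeholder, not the real schema check):

```python
import json

def _parses(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def json_validity(outputs: list[str]) -> float:
    """Fraction of model outputs that parse as JSON at all."""
    return sum(1 for o in outputs if _parses(o)) / len(outputs)

def schema_compliance(outputs: list[str], required_keys: set[str]) -> float:
    """Fraction of parseable outputs containing every required top-level key."""
    parsed = [json.loads(o) for o in outputs if _parses(o)]
    ok = sum(1 for p in parsed if isinstance(p, dict) and required_keys <= p.keys())
    return ok / max(len(parsed), 1)

outs = ['{"intentType": "slice", "sla": {}}', 'not json', '{"sla": {}}']
print(json_validity(outs))                      # 2 of 3 parse
print(schema_compliance(outs, {"intentType"}))  # 1 of 2 parseable comply
```

Note that compliance is measured over the parseable subset, so the two numbers are not directly comparable percentages of the full test set.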
## Customization
### Use a different dataset
Edit `train.py`:
```python
DATASET_NAME = "your-username/your-dataset"
DATASET_CONFIG = "default"
TRAIN_SPLIT = "train"
```
### Use a different base model
Edit `train.py`:
```python
MODEL_NAME = "meta-llama/Llama-3.1-8B-Instruct"
```
### Change LoRA parameters (if OOM)
Edit `train.py`:
```python
LORA_R = 32 # lower = less memory, less capacity
LORA_ALPHA = 8
MAX_LENGTH = 256 # shorter sequences
```
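Lowering `LORA_R` shrinks adapter memory roughly linearly: each adapted linear layer of shape d_in × d_out contributes r · (d_in + d_out) trainable parameters (the A and B low-rank matrices). A quick estimator, where the layer shapes are illustrative placeholders rather than values read from the actual Qwen2.5-7B config:

```python
def lora_param_count(r: int, layer_shapes: list[tuple[int, int]]) -> int:
    """Trainable params: each (d_in, d_out) linear gets A (r x d_in) + B (d_out x r)."""
    return sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)

# Hypothetical shapes: two 3584x3584 attention projections per block
# (hidden size 3584 assumed for illustration), over 28 layers.
shapes = [(3584, 3584), (3584, 3584)] * 28

full = lora_param_count(64, shapes)
half = lora_param_count(32, shapes)
print(full, half)  # halving the rank exactly halves the adapter size
```

Halving `LORA_R` halves both the adapter parameter count and its optimizer-state footprint, which is why it is the first knob to turn on OOM.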
### Push to your own hub
Edit `train.py` and `merge_and_push.py`:
```python
HUB_MODEL_ID = "your-username/your-model-name"
```
### Enable Liger kernel (A100/H100 only)
Edit `train.py`:
```python
use_liger_kernel=True # ONLY on A100 / L40 / H100. WILL crash on T4.
```
And install: `pip install "liger-kernel>=0.5.0"` (quote the spec so the shell does not treat `>` as a redirect).
## Architecture Diagram
```
┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐
│  Natural Lang   │────▶│   LoRA LLM   │────▶│  JSON Config    │
│  Intent         │     │  (7B params) │     │  (TMF921/3GPP   │
│  "Deploy URLLC  │     │  fine-tuned  │     │  /CAMARA/etc)   │
│  slice..."      │     │  on telecom  │     │                 │
└─────────────────┘     └──────────────┘     └─────────────────┘
                               │
                               ▼
                        ┌──────────────┐
                        │  4-bit NF4   │
                        │ QLoRA (r=64) │
                        └──────────────┘
```
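At inference time the model sees the same ChatML roles it was trained on. A sketch of the prompt assembly, with a placeholder system prompt (`inference.py` may word it differently):

```python
def build_messages(intent: str, target_layer: str = "tmf921") -> list[dict]:
    """Wrap a natural-language intent in the ChatML roles used during training."""
    return [
        {"role": "system",
         "content": f"Translate the network intent into a {target_layer} "
                    "configuration. Respond with JSON only."},
        {"role": "user", "content": intent},
    ]

msgs = build_messages("Deploy URLLC slice for factory automation with 1ms latency")
# With transformers installed, this becomes the prompt string via:
#   tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
print(msgs[0]["content"])
```

Keeping the system prompt identical between training and inference matters: a fine-tuned 7B model is sensitive to prompt drift, and mismatched wording is a common cause of the JSON parsing errors listed under Troubleshooting.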
## Troubleshooting
| Problem | Fix |
|---------|-----|
| `CUDA OOM` during training | Lower `LORA_R` to 32, reduce `MAX_LENGTH` to 256, or set `per_device_train_batch_size=1` |
| `liger-kernel` / Triton crash | Already disabled by default. Do NOT install or enable on T4. |
| `ValueError: Can't find adapter_config.json` | Training didn't finish or save. Check train.py output for errors. |
| JSON parsing errors in inference | Check `--temperature` (lower = more deterministic), or re-train with more epochs |
| T4 doesn't support bf16 | Already handled: `fp16=True, bf16=False` in config |
| Slow training | Enable T4x2 (not T4x1) on Kaggle for ~2x speedup |
| `HFValidationError` on local paths | Use absolute paths or ensure adapter directory exists |
## Known Issues
1. **Liger kernel on T4**: `liger_kernel` Triton operators crash with gradient checkpointing on T4 GPUs. This is a known incompatibility. The fix is disabling liger (`use_liger_kernel=False`), which is the default.
2. **Manual gradient checkpointing**: Do not call `model.gradient_checkpointing_enable()` manually when using `SFTTrainer`. The trainer handles this automatically via `gradient_checkpointing=True` in `SFTConfig`. Manual enablement conflicts with PEFT + 4-bit quantization.
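Both issues reduce to passing the right flags through the training config instead of touching the model object. A hedged sketch of the relevant `SFTConfig` fragment (other arguments omitted; `gradient_checkpointing` and `use_liger_kernel` are inherited from transformers' `TrainingArguments`):

```python
from trl import SFTConfig

config = SFTConfig(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    fp16=True,                    # T4 has no bf16
    bf16=False,
    gradient_checkpointing=True,  # let the trainer enable it; never call
                                  # model.gradient_checkpointing_enable() yourself
    use_liger_kernel=False,       # crashes with checkpointing on T4
)
```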
## Citation
If you use this pipeline, cite:
```bibtex
@misc{nraptisss2026telecom,
title={Telecom Intent-to-Config Pipeline},
author={Raptis, Nikos},
year={2026},
url={https://huggingface.co/nraptisss/telecom-intent-pipeline}
}
```
## References
- **ORION**: [arXiv:2603.03667](https://arxiv.org/abs/2603.03667) — Intent-aware orchestration in O-RAN
- **DeepForm**: [arXiv:2506.08551](https://arxiv.org/abs/2506.08551) — Reasoning LLM for communication formulation
- **QLoRA**: [arXiv:2305.14314](https://arxiv.org/abs/2305.14314) — 4-bit quantization for fine-tuning
- **TRL**: [https://github.com/huggingface/trl](https://github.com/huggingface/trl)
## License
Apache-2.0