# Telecom Intent-to-Config Pipeline
Complete, zero-cost pipeline for fine-tuning and evaluating LLMs that translate natural language network intents into 5G/6G standard-compliant configurations.
## What's Inside
| File | Purpose |
|------|---------|
| `train.py` | QLoRA fine-tuning on your intent dataset (Kaggle T4x2 optimized) |
| `inference.py` | Generate configs from natural language intents (interactive or batch) |
| `merge_and_push.py` | Merge LoRA adapters + push merged model to Hugging Face Hub |
| `benchmark.py` | Evaluate on test set: JSON validity, schema compliance, semantic similarity |
| `kaggle_notebook.ipynb` | Ready-to-run Kaggle notebook (download scripts → train → test → benchmark) |
| `requirements.txt` | Python dependencies |
## Supported Datasets (Your Own)
- `nraptisss/TMF921-intent-to-config-augmented` (~35K, multi-layer)
- `nraptisss/TMF921-intent-to-config-25k` (~25K, multi-layer)
- `nraptisss/telecom-intent-config-sft-10k` (~10K, multi-layer)
All datasets use ChatML `messages` format with system/user/assistant roles.
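A single training record in this format looks roughly like the sketch below (the message contents are illustrative placeholders, not an actual row from the datasets):

```python
import json

# Illustrative ChatML-style record; real rows differ in content,
# but the three-role structure is the same across all three datasets.
record = {
    "messages": [
        {"role": "system", "content": "You translate network intents into TMF921 JSON."},
        {"role": "user", "content": "Deploy URLLC slice for factory automation with 1ms latency"},
        {"role": "assistant", "content": json.dumps({"intentType": "urllc_slice"})},
    ]
}

def is_chatml_record(rec: dict) -> bool:
    """Check the system/user/assistant structure SFTTrainer expects."""
    msgs = rec.get("messages", [])
    roles = [m.get("role") for m in msgs]
    return roles == ["system", "user", "assistant"] and all(
        isinstance(m.get("content"), str) for m in msgs
    )

print(is_chatml_record(record))  # True
```

A quick pass of `is_chatml_record` over a new dataset before training catches malformed rows early, before `SFTTrainer` hits them mid-epoch.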
## Supported Target Layers
| Layer | Standard | Output Format |
|-------|----------|---------------|
| `tmf921` | TM Forum Intent Management API v5.0.0 | JSON intent FVO |
| `intent_3gpp` | 3GPP TS 28.312 Rel-18 | JSON intent payload |
| `camara` | CAMARA NetworkSliceBooking | OpenAPI JSON |
| `etsi_zsm` | ETSI ZSM GS 009-1 | JSON service profile |
| `a1_policy` | O-RAN WG2 A1 Interface | JSON policy |
| `o1_nrm` | 3GPP TS 28.541 NR NRM | YANG/XML-style JSON |
## Quick Start: Kaggle (Free T4x2 GPU)
### Option A — Notebook (easiest, 5-minute setup)
1. Go to [Kaggle β†’ New Notebook](https://www.kaggle.com/code)
2. Add **GPU T4 x2** accelerator (right panel)
3. Paste the contents of `kaggle_notebook.ipynb` into the first cell
4. Run all cells top to bottom
### Option B — Terminal
```bash
pip install -q transformers trl peft accelerate bitsandbytes datasets sentence-transformers huggingface-hub
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/train.py
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/inference.py
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/merge_and_push.py
wget https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/benchmark.py
python train.py # 2-3 hours on T4x2
python inference.py --intent "Deploy URLLC slice for factory automation with 1ms latency"
python benchmark.py --max_samples 100
python merge_and_push.py # pushes to your hub
```
## Hyperparameters (T4x2 Optimized)
| Parameter | Value | Rationale |
|-----------|-------|-----------|
| Model | Qwen2.5-7B-Instruct | Fits 16GB VRAM, strong reasoning |
| Quantization | 4-bit NF4 | QLoRA, enables 7B on T4 |
| LoRA rank | 64 | Balances capacity and memory |
| LoRA alpha | 16 | Standard α = r/4 |
| Batch size | 1 per GPU | Fits T4 VRAM |
| Gradient accumulation | 4 | Effective batch = 8 (1 × 2 GPUs × 4) |
| Learning rate | 2e-4 | 10× base for LoRA (TRL recommendation) |
| Max length | 512 | Covers most intent→config pairs |
| Epochs | 3 | Sufficient for ~35K samples |
| Liger kernel | **Disabled** | Crashes on T4 with gradient checkpointing |
| FP16 | True | T4 has no bf16 support |
**Note:** `liger-kernel` is intentionally **disabled** by default (`use_liger_kernel=False`). It causes Triton crashes on T4 GPUs when combined with gradient checkpointing. Only enable it if training on A100 / L40 / H100.
## Expected Results
After 3 epochs on `TMF921-intent-to-config-augmented`:
- **Training time**: ~2–3 hours on Kaggle T4x2
- **JSON validity**: >90% on test set
- **Schema compliance**: >85% (key presence check)
- **Model size**: ~14GB merged (FP16), ~200MB LoRA adapters
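The validity and compliance metrics above can be checked with logic along these lines; this is a minimal sketch, and `benchmark.py` may compute them differently (the required-key set here is a placeholder, not the real schema check):

```python
import json

def _parses(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def json_validity(outputs: list[str]) -> float:
    """Fraction of model outputs that parse as JSON at all."""
    return sum(1 for o in outputs if _parses(o)) / len(outputs)

def schema_compliance(outputs: list[str], required_keys: set[str]) -> float:
    """Fraction of parseable outputs containing every required top-level key."""
    parsed = [json.loads(o) for o in outputs if _parses(o)]
    ok = sum(1 for p in parsed if isinstance(p, dict) and required_keys <= p.keys())
    return ok / max(len(parsed), 1)

outs = ['{"intentType": "slice", "sla": {}}', 'not json', '{"sla": {}}']
print(json_validity(outs))                      # 2 of 3 parse
print(schema_compliance(outs, {"intentType"}))  # 1 of 2 parseable comply
```

Note that compliance is measured over the parseable subset, so the two numbers are not directly comparable percentages of the full test set.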
## Customization
### Use a different dataset
Edit `train.py`:
```python
DATASET_NAME = "your-username/your-dataset"
DATASET_CONFIG = "default"
TRAIN_SPLIT = "train"
```
### Use a different base model
Edit `train.py`:
```python
MODEL_NAME = "meta-llama/Llama-3.1-8B-Instruct"
```
### Change LoRA parameters (if OOM)
Edit `train.py`:
```python
LORA_R = 32 # lower = less memory, less capacity
LORA_ALPHA = 8
MAX_LENGTH = 256 # shorter sequences
```
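Lowering `LORA_R` shrinks adapter memory roughly linearly: each adapted linear layer of shape d_in × d_out contributes r · (d_in + d_out) trainable parameters (the A and B low-rank matrices). A quick estimator, where the layer shapes are illustrative placeholders rather than values read from the actual Qwen2.5-7B config:

```python
def lora_param_count(r: int, layer_shapes: list[tuple[int, int]]) -> int:
    """Trainable params: each (d_in, d_out) linear gets A (r x d_in) + B (d_out x r)."""
    return sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)

# Hypothetical shapes: two 3584x3584 attention projections per block
# (hidden size 3584 assumed for illustration), over 28 layers.
shapes = [(3584, 3584), (3584, 3584)] * 28

full = lora_param_count(64, shapes)
half = lora_param_count(32, shapes)
print(full, half)  # halving the rank exactly halves the adapter size
```

Halving `LORA_R` halves both the adapter parameter count and its optimizer-state footprint, which is why it is the first knob to turn on OOM.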
### Push to your own hub
Edit `train.py` and `merge_and_push.py`:
```python
HUB_MODEL_ID = "your-username/your-model-name"
```
### Enable Liger kernel (A100/H100 only)
Edit `train.py`:
```python
use_liger_kernel=True # ONLY on A100 / L40 / H100. WILL crash on T4.
```
And install: `pip install "liger-kernel>=0.5.0"` (quote the spec so the shell does not treat `>` as a redirect).
## Architecture Diagram
```
┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐
│  Natural Lang   │────▶│   LoRA LLM   │────▶│  JSON Config    │
│  Intent         │     │  (7B params) │     │  (TMF921/3GPP   │
│  "Deploy URLLC  │     │  fine-tuned  │     │  /CAMARA/etc)   │
│  slice..."      │     │  on telecom  │     │                 │
└─────────────────┘     └──────────────┘     └─────────────────┘
                               │
                               ▼
                        ┌──────────────┐
                        │  4-bit NF4   │
                        │ QLoRA (r=64) │
                        └──────────────┘
```
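At inference time the model sees the same ChatML roles it was trained on. A sketch of the prompt assembly, with a placeholder system prompt (`inference.py` may word it differently):

```python
def build_messages(intent: str, target_layer: str = "tmf921") -> list[dict]:
    """Wrap a natural-language intent in the ChatML roles used during training."""
    return [
        {"role": "system",
         "content": f"Translate the network intent into a {target_layer} "
                    "configuration. Respond with JSON only."},
        {"role": "user", "content": intent},
    ]

msgs = build_messages("Deploy URLLC slice for factory automation with 1ms latency")
# With transformers installed, this becomes the prompt string via:
#   tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
print(msgs[0]["content"])
```

Keeping the system prompt identical between training and inference matters: a fine-tuned 7B model is sensitive to prompt drift, and mismatched wording is a common cause of the JSON parsing errors listed under Troubleshooting.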
## Troubleshooting
| Problem | Fix |
|---------|-----|
| `CUDA OOM` during training | Lower `LORA_R` to 32, reduce `MAX_LENGTH` to 256, or set `per_device_train_batch_size=1` |
| `liger-kernel` / Triton crash | Already disabled by default. Do NOT install or enable on T4. |
| `ValueError: Can't find adapter_config.json` | Training didn't finish or save. Check train.py output for errors. |
| JSON parsing errors in inference | Check `--temperature` (lower = more deterministic), or re-train with more epochs |
| T4 doesn't support bf16 | Already handled: `fp16=True, bf16=False` in config |
| Slow training | Enable T4x2 (not T4x1) on Kaggle for ~2x speedup |
| `HFValidationError` on local paths | Use absolute paths or ensure adapter directory exists |
## Known Issues
1. **Liger kernel on T4**: `liger_kernel` Triton operators crash with gradient checkpointing on T4 GPUs. This is a known incompatibility. The fix is disabling liger (`use_liger_kernel=False`), which is the default.
2. **Manual gradient checkpointing**: Do not call `model.gradient_checkpointing_enable()` manually when using `SFTTrainer`. The trainer handles this automatically via `gradient_checkpointing=True` in `SFTConfig`. Manual enablement conflicts with PEFT + 4-bit quantization.
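Both issues reduce to passing the right flags through the training config instead of touching the model object. A hedged sketch of the relevant `SFTConfig` fragment (other arguments omitted; `gradient_checkpointing` and `use_liger_kernel` are inherited from transformers' `TrainingArguments`):

```python
from trl import SFTConfig

config = SFTConfig(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    fp16=True,                    # T4 has no bf16
    bf16=False,
    gradient_checkpointing=True,  # let the trainer enable it; never call
                                  # model.gradient_checkpointing_enable() yourself
    use_liger_kernel=False,       # crashes with checkpointing on T4
)
```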
## Citation
If you use this pipeline, cite:
```bibtex
@misc{nraptisss2026telecom,
title={Telecom Intent-to-Config Pipeline},
author={Raptis, Nikos},
year={2026},
url={https://huggingface.co/nraptisss/telecom-intent-pipeline}
}
```
## References
- **ORION**: [arXiv:2603.03667](https://arxiv.org/abs/2603.03667) — Intent-aware orchestration in O-RAN
- **DeepForm**: [arXiv:2506.08551](https://arxiv.org/abs/2506.08551) — Reasoning LLM for communication formulation
- **QLoRA**: [arXiv:2305.14314](https://arxiv.org/abs/2305.14314) — 4-bit quantization for fine-tuning
- **TRL**: [https://github.com/huggingface/trl](https://github.com/huggingface/trl)
## License
Apache-2.0