{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Telecom Intent-to-Config Pipeline\n", "\n", "Fine-tune Qwen2.5-7B on your TMF921 intent dataset using QLoRA on Kaggle T4x2.\n", "\n", "## Step 1: Install Dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "!pip install -q transformers trl peft accelerate bitsandbytes datasets liger-kernel sentence-transformers huggingface-hub\n", "!pip install -q --upgrade transformers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Log in to Hugging Face\n", "\n", "Get your token from https://huggingface.co/settings/tokens (the token needs `write` access)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "from huggingface_hub import notebook_login\n", "notebook_login()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Download Scripts from Hub" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "!wget -q https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/train.py\n", "!wget -q https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/inference.py\n", "!wget -q https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/merge_and_push.py\n", "!wget -q https://huggingface.co/nraptisss/telecom-intent-pipeline/resolve/main/benchmark.py\n", "!ls -la" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: Run Training\n", "\n", "Training takes roughly 2-3 hours on Kaggle T4x2 for 3 epochs on 30K samples.\n", "\n", "**Edit `train.py` first** if you want to change the dataset, model, or hyperparameters." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "!python train.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5: Test Inference" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "!python inference.py --intent \"Deploy a low-latency URLLC slice for autonomous drones in the harbor zone with 1ms latency and 99.999% reliability\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 6: Merge & Push to Hub" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "!python merge_and_push.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 7: Benchmark on Test Set" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "!python benchmark.py --max_samples 100 --output benchmark_results.json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 8: View Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "source": [ "import json\n", "\n", "# Load the summary written by benchmark.py in Step 7.\n", "with open('benchmark_results.json', 'r') as f:\n", "    data = json.load(f)\n", "\n", "print(f\"JSON Valid Rate: {data['summary']['json_valid_rate']:.1%}\")\n", "print(f\"Schema Compliance: {data['summary']['avg_schema_compliance']:.1%}\")\n", "# Compare against None so that a legitimate score of 0.0 is still printed.\n", "if data['summary'].get('semantic_similarity_avg') is not None:\n", "    print(f\"Semantic Similarity: {data['summary']['semantic_similarity_avg']:.3f}\")\n", "\n", "# Per-layer breakdown of validity and schema compliance.\n", "for layer, s in data['summary']['per_layer'].items():\n", "    print(f\" {layer:20s} valid={s['valid_rate']:.1%} compliance={s['avg_compliance']:.1%}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.10.0" } }, "nbformat": 4, "nbformat_minor": 4 }