# SmolLM-TS-360M-it
A 360M parameter instruction-tuned language model specialised in 3GPP and ETSI telecommunications standards. Trained via full fine-tuning on TeleSpec-Data followed by LoRA instruction fine-tuning on Alpaca.
Part of the SmolLM-TS series: small language models adapted exclusively to telecommunications standards documents, with no arXiv or web content in the training corpus.
Looking for the base pretrained version? See nareshmodina/SmolLM-TS-360M
## Model Details
| | |
|---|---|
| Base model | HuggingFaceTB/SmolLM2-360M |
| Parameters | 360M |
| Training | Full FT pretrain → LoRA SFT (Alpaca) |
| Pretraining data | TeleSpec-Data (1.87B tokens) |
| SFT data | Alpaca 52k |
| Context length | 4096 tokens |
| Hardware | 3× NVIDIA L40S (48 GB) |
## Training
### Stage 1: Full fine-tuning on TeleSpec-Data
All model weights were updated on 457,160 packed 4096-token blocks (1.87B tokens) drawn from 38,302 standards documents: 15,054 3GPP documents (Rel-8 to Rel-19) and 23,248 ETSI documents spanning 15 working groups (2000–2024). No arXiv or web content: 100% standards text.
- Epochs: 2 · Effective batch size: 128 · LR: 5e-5 (cosine)
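The packing step described above can be sketched as follows. This is a hypothetical illustration (the exact pipeline is not published): tokenized documents are concatenated into one stream and sliced into fixed-length blocks, with any leftover tokens that do not fill a complete block dropped, a common convention for causal-LM pretraining.

```python
def pack_tokens(docs_token_ids, block_size=4096):
    """Concatenate tokenized documents and slice into full fixed-length blocks."""
    stream = []
    for ids in docs_token_ids:
        stream.extend(ids)
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

# Toy example with a tiny block size to show the behaviour:
# two full blocks of 4 tokens are produced; the trailing token (9) is dropped.
blocks = pack_tokens([[1, 2, 3], [4, 5], [6, 7, 8, 9]], block_size=4)
```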
### Stage 2: LoRA instruction fine-tuning
LoRA (r=16, α=32) on Alpaca 52k. Base weights are frozen to preserve domain knowledge.
- Epochs: 1 · LR: 1e-5
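The LoRA reparameterization used in Stage 2 can be sketched numerically. This is an illustration of the standard LoRA update (not the training code): the frozen base weight W is augmented with a low-rank delta B·A scaled by α/r, and with the usual zero initialization of B the adapted weight starts out identical to the base.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 16, 32          # r=16, alpha=32 as in Stage 2

W = rng.standard_normal((d, d))           # frozen base weight
A = rng.standard_normal((r, d)) * 0.01    # trainable down-projection
B = np.zeros((d, r))                      # trainable up-projection, init to zero

delta = (alpha / r) * (B @ A)             # low-rank update, zero at init
W_effective = W + delta

# At initialization B is zero, so the adapted weight equals the base weight;
# only A and B (r*d + d*r parameters) are trained, not the d*d base matrix.
```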
## Evaluation
Evaluated on Tele-Eval using the metrics defined in Maatouk et al. (2024), on standards-derived questions only (`standard_*` IDs, 10,000 examples, seed 42).
| Model | Ans-PPL ↓ | SemScore ↑ |
|---|---|---|
| SmolLM2-360M-alpaca (base + Alpaca SFT) | 10.86 | 0.6216 |
| SmolLM-TS-360M-it (ours) | 8.62 | 0.6572 |
This is a 20.6% Ans-PPL reduction relative to the base+SFT baseline. Comparison across model sizes:
| Model | Ans-PPL ↓ | SemScore ↑ |
|---|---|---|
| SmolLM-TS-135M-it | 9.19 | 0.6504 |
| SmolLM-TS-360M-it | 8.62 | 0.6572 |
Both metrics improve consistently with model size.
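As a reference for the Ans-PPL numbers above: answer perplexity is the perplexity of the reference answer tokens conditioned on the question, i.e. the exponential of the mean negative log-likelihood over the answer tokens only. A minimal sketch (the token log-probabilities here are made up; in practice they come from the model's scores):

```python
import math

def answer_perplexity(token_logprobs):
    """exp(-mean(log p)) over the answer tokens only."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Example: if every answer token has probability 1/8, perplexity is 8.
lp = [math.log(1 / 8)] * 5
ppl = answer_perplexity(lp)
```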
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "nareshmodina/SmolLM-TS-360M-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Alpaca-style "Question: ... Answer:" prompt, matching the SFT format.
prompt = (
    "The following is a question about telecommunications and networking.\n"
    "Question: What is the purpose of the RRC Connection Establishment procedure in LTE?\n"
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=False,          # greedy decoding
    repetition_penalty=1.3,
)

# Decode only the newly generated tokens, skipping the prompt.
answer = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(answer)
```
Note: Use the Alpaca-style `Question: ... Answer:` prompt format for best results.
## Limitations
- Alpaca SFT: trained for Q&A-style responses, not multi-turn conversation
- Standards only: strong 3GPP/ETSI knowledge, limited general telecom knowledge
- Not for production: intended for research purposes only
## Links
- Dataset: nareshmodina/TeleSpec-Data
- Base model: nareshmodina/SmolLM-TS-360M
- Benchmark: AliMaatouk/Tele-Eval
- Collection: nareshmodina/SmolLM-TS
## Citation
```bibtex
@misc{modina2025smollmts,
  author    = {Naresh Modina},
  title     = {SmolLM-TS: Small Language Models for Telecommunications Standards},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/nareshmodina/SmolLM-TS-360M-it}
}

@misc{maatouk2024telellms,
  title         = {Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications},
  author        = {Ali Maatouk and Kenny Chirino Ampudia and Rex Ying and Leandros Tassiulas},
  year          = {2024},
  eprint        = {2409.05314},
  archivePrefix = {arXiv},
  primaryClass  = {cs.IT}
}
```