SmolLM-135M-CPT-LoRA-r32
Continued pre-training of SmolLM-135M on arXiv ML papers via LoRA (r=32).
Side-by-side generation: Base model vs Full Fine-Tuning (bf16) vs this model (CPT LoRA r=32)
Model Description
- Base model: HuggingFaceTB/SmolLM-135M (135M parameters)
- Method: Continued Pre-Training (CPT) with LoRA
- Domain: Machine Learning / arXiv papers (2024–2026)
- Task: Next-token prediction / scientific text generation
Training Details
| Parameter | Value |
|---|---|
| LoRA rank | 32 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable params | ~9.7M LoRA params on the 135M base (6.77% of base + adapter parameters) |
| Quantization | 4-bit (QLoRA via Unsloth) |
| Batch size | 32 |
| Gradient accumulation | 2 (effective batch: 64) |
| Learning rate | 2e-4 (linear decay) |
| Warmup steps | 100 |
| Epochs | 10 |
| Sequence length | 512 tokens |
| Chunking | 256-word chunks, 20% overlap, packed |
| Hardware | NVIDIA RTX 4090 |
| Training time | ~14 min |
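
The trainable-parameter figure in the table can be reproduced with a little arithmetic. The sketch below assumes the SmolLM-135M layer shapes listed in the constants (hidden size, MLP width, layer and head counts, base parameter count); none of these are stated in this card, so check them against the base model's `config.json` before relying on the result.

```python
# Assumed SmolLM-135M shapes (not stated in this card; verify in config.json).
HIDDEN = 576               # hidden_size
INTERMEDIATE = 1536        # intermediate_size (MLP width)
LAYERS = 30                # num_hidden_layers
HEADS = 9                  # num_attention_heads
KV_HEADS = 3               # num_key_value_heads (grouped-query attention)
BASE_PARAMS = 134_515_008  # assumed base parameter count

RANK = 32  # LoRA rank from the table above

head_dim = HIDDEN // HEADS
kv_dim = KV_HEADS * head_dim

# (in_features, out_features) for each targeted linear layer
target_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, kv_dim),
    "v_proj": (HIDDEN, kv_dim),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    # LoRA adds two matrices per adapted layer: A (rank x in) and B (out x rank).
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in target_shapes.values())
trainable = per_layer * LAYERS
share = trainable / (BASE_PARAMS + trainable)

print(f"trainable LoRA params: {trainable:,}")
print(f"share of base + adapter: {share:.2%}")
```

Under these assumptions the percentage matches the table only when the adapter parameters are included in the denominator, which is how the 6.77% figure appears to be computed.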
Training Data
- 188 arXiv ML papers (2024–2026), downloaded via the arXiv API
- Papers cleaned: references section removed, appendix preserved
- Split: 138 train / 50 validation
- After chunking + packing: ~5,200 training sequences
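
The chunking step (256-word chunks with 20% overlap) could look roughly like the sketch below. The function name `chunk_words`, the whitespace tokenisation, and the rounding of the stride are all assumptions for illustration; the actual preprocessing script is not shown in this card, and the packing step is not covered here.

```python
def chunk_words(text, chunk_size=256, overlap=0.20):
    """Split text into overlapping chunks of whitespace-separated words."""
    words = text.split()
    # Distance between chunk starts; rounding down is an assumption.
    stride = max(1, int(chunk_size * (1 - overlap)))
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


# Toy document of 600 numbered "words" standing in for a cleaned paper.
doc = " ".join(f"w{i}" for i in range(600))
chunks = chunk_words(doc)

for i, chunk in enumerate(chunks):
    tokens = chunk.split()
    print(i, tokens[0], tokens[-1], len(tokens))
```

With these settings consecutive chunks share 52 words, so text near a chunk boundary is seen with context on both sides.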
Evaluation Results
Evaluated on the 50 held-out papers with 50 samples, each a 20-word prefix followed by a 50-word generation:
| Metric | Base Model | This Model | Δ (relative) |
|---|---|---|---|
| Perplexity | 22.97 | 18.36 | -20.1% |
| Cross-Entropy | 3.134 | 2.910 | -7.1% |
| ROUGE-1 | 0.178 | 0.213 | +19.7% |
| ROUGE-L | 0.114 | 0.143 | +25.4% |
| BERTScore F1 | 0.736 | 0.753 | +2.3% |
| BLEU | 0.016 | 0.022 | +37.5% |
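
As a sanity check on the table, the short script below recomputes the relative changes from the reported values and confirms that perplexity is the exponential of the cross-entropy. It assumes the cross-entropy is a mean per-token loss in nats, which the card does not state explicitly; the numbers are copied from the table, not re-measured.

```python
import math

# (base model, this model) pairs copied from the table above
results = {
    "Perplexity": (22.97, 18.36),
    "Cross-Entropy": (3.134, 2.910),
    "ROUGE-1": (0.178, 0.213),
    "ROUGE-L": (0.114, 0.143),
    "BERTScore F1": (0.736, 0.753),
    "BLEU": (0.016, 0.022),
}


def relative_change(base, new):
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100


for name, (base, new) in results.items():
    print(f"{name}: {relative_change(base, new):+.1f}%")

# Perplexity = exp(mean cross-entropy), assuming the loss is in nats.
ppl_base = math.exp(results["Cross-Entropy"][0])
ppl_new = math.exp(results["Cross-Entropy"][1])
print(f"exp(CE): base {ppl_base:.2f}, this model {ppl_new:.2f}")
```

Because perplexity is an exponential of the loss, a 7.1% drop in cross-entropy corresponds to the larger 20.1% drop in perplexity.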
Key Findings from Experiment Loop
This model was selected as the winner from a systematic experiment loop:
- LoRA beats full fine-tuning on small datasets: 138 training papers are too few for full fine-tuning, and LoRA's regularisation helps
- Rank doesn't matter much: r=8/16/32 all plateau at the same eval loss; data is the bottleneck
- Interleaving with large HF datasets didn't help at this data scale: the domain signal gets diluted
How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "HuggingFaceTB/SmolLM-135M"
adapter_id = "JaydeepR/SmolLM-135M-CPT-LoRA-r32"

# Load the base model and tokenizer, then attach the LoRA adapter weights
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(model, adapter_id)

# Continue a paper-style prompt; the repetition penalty discourages loops
prompt = "We propose a novel attention mechanism that"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, repetition_penalty=1.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Limitations
- Trained on only 138 papers (188 collected, 50 held out for validation): sufficient for stylistic adaptation, not factual knowledge
- May hallucinate scientific content (model learns paper structure, not paper facts)
- Optimised for ML paper generation; may not generalise to other scientific domains
- 135M-parameter model: limited overall capability
Citation
```bibtex
@misc{smollm135m-cpt-lora,
  author = {Jaydeep Raijada},
  title = {SmolLM-135M CPT LoRA r=32: Continued Pre-Training on arXiv ML Papers},
  year = {2026},
  url = {https://huggingface.co/JaydeepR/SmolLM-135M-CPT-LoRA-r32}
}
```