CRMA Fine-Tuner: QLoRA + gradient stability adapter (CRMA + ZClip + PiSSA) for TinyLlama, Gemma, Mistral — ablation results
Hey leaderboard community! I built a fine-tuning Space that adds gradient stability on top of QLoRA, and wanted to share the ablation results here since many of you evaluate models post-training.
Space: https://huggingface.co/spaces/open-llm-leaderboard/crma-fine-tuner
What is CRMA Fine-Tuner?
CRMA Fine-Tuner is a Hugging Face Space for fine-tuning TinyLlama, Gemma, and Mistral with three stability components layered on top of standard QLoRA:
- CRMA (Centered Residual Moving Average) — a low-rank adapter that smooths gradient updates using an exponential moving average of residuals
- ZClip — adaptive gradient clipping based on z-score statistics, replacing a fixed `max_grad_norm`
- PiSSA — principal singular value initialization for LoRA weights, for faster convergence
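To make the ZClip idea above concrete, here is a minimal sketch of z-score-based clipping. This is not the Space's actual implementation; the function name, the `z_thresh`/`warmup` parameters, and the running-statistics scheme are all illustrative assumptions. The key difference from a fixed `max_grad_norm` is that the clipping threshold adapts to the recent distribution of gradient norms:

```python
import numpy as np

def zclip_scale(grad_norm, history, z_thresh=2.5, warmup=10):
    """Hypothetical sketch of z-score adaptive clipping.

    Returns a factor in (0, 1] to scale the gradients by. A norm more
    than z_thresh standard deviations above the recent mean is clipped
    back to the threshold, instead of to a fixed max_grad_norm.
    """
    history.append(grad_norm)
    if len(history) <= warmup:
        return 1.0  # not enough statistics yet; leave gradients alone
    past = history[:-1]
    mean = np.mean(past)
    std = np.std(past) + 1e-8  # avoid division by zero
    z = (grad_norm - mean) / std
    if z > z_thresh:
        return (mean + z_thresh * std) / grad_norm
    return 1.0
```

In a training loop this factor would multiply the gradients right before `optimizer.step()`, where a fixed-threshold clip would otherwise go.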
The idea came from observing repeated loss spikes during Mistral fine-tuning that standard gradient clipping couldn’t handle cleanly.
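The PiSSA initialization from the component list can also be sketched briefly. Again, this is an illustrative assumption rather than the Space's code: the top-r singular directions of the base weight seed the LoRA factors, and the residual is folded back into the frozen weight so the decomposition is exact at initialization:

```python
import numpy as np

def pissa_init(W, r):
    """Sketch of PiSSA-style initialization.

    The principal rank-r part of W initializes the trainable LoRA
    factors A and B; the leftover W_res is kept as the frozen base
    weight, so A @ B + W_res == W before any training step.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    sqrt_S = np.sqrt(S[:r])
    A = U[:, :r] * sqrt_S           # (d_out, r) trainable factor
    B = sqrt_S[:, None] * Vt[:r]    # (r, d_in) trainable factor
    W_res = W - A @ B               # frozen residual weight
    return A, B, W_res
```

Because the adapter starts from the principal components of W rather than from zeros or random noise, the first optimizer steps already move along the directions that matter most, which is the intuition behind the faster convergence claim.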
Ablation Results (TinyLlama-1.1B on Alpaca-style data)
| Config | Final Loss | Spike Count | Notes |
|---|---|---|---|
| Baseline QLoRA | 1.42 | 7 | Standard setup |
| + ZClip | 1.38 | 3 | Adaptive clipping helps |
| + PiSSA init | 1.35 | 3 | Better initialization |
| + CRMA (full) | 1.31 | 1 | Full stability stack |
Why This Might Matter for Leaderboard Submissions
Models fine-tuned with unstable training often show inconsistent benchmark performance: loss-spike artifacts can surface as run-to-run variance in eval scores. CRMA aims to produce cleaner, more reproducible fine-tunes.
Feedback welcome, especially from anyone who has noticed training instability affecting their leaderboard submissions.
Space: https://huggingface.co/spaces/open-llm-leaderboard/crma-fine-tuner