CRMA Fine-Tuner: QLoRA + gradient stability adapter (CRMA + ZClip + PiSSA) for TinyLlama, Gemma, Mistral — ablation results

#1153
by Fourwheels2512 - opened

Hey leaderboard community! I built a fine-tuning Space that adds gradient stability on top of QLoRA, and wanted to share the ablation results here since many of you evaluate models post-training.

Space: https://huggingface.co/spaces/open-llm-leaderboard/crma-fine-tuner

What is CRMA Fine-Tuner?

CRMA Fine-Tuner is a Hugging Face Space for fine-tuning TinyLlama, Gemma, and Mistral with three stability components layered on top of standard QLoRA:

  • CRMA (Centered Residual Moving Average) — a low-rank adapter that smooths gradient updates using an exponential moving average of residuals
  • ZClip — adaptive gradient clipping based on z-score statistics, replacing fixed max_grad_norm
  • PiSSA — initializes the LoRA adapter from the principal singular values and singular vectors of the base weights (instead of the usual random/zero init), for faster convergence
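
To make the ZClip idea concrete, here is a minimal sketch of z-score-based adaptive clipping. This is my illustration, not the Space's actual code: it tracks a running mean/variance of the gradient norm and clips only when the current norm is a statistical outlier, rather than enforcing a fixed max_grad_norm.

```python
import math

class ZScoreClipper:
    """Hypothetical sketch of z-score gradient clipping (not the Space's code).

    Keeps EMA statistics of the gradient norm and returns a scaling
    coefficient: 1.0 when the norm is in-distribution, < 1.0 when the
    norm is more than `z_threshold` standard deviations above the mean.
    """

    def __init__(self, z_threshold=2.5, ema_decay=0.99):
        self.z = z_threshold
        self.decay = ema_decay
        self.mean = None   # EMA of observed (post-clip) grad norms
        self.var = 0.0     # EMA of squared deviation

    def clip_coef(self, grad_norm):
        if self.mean is None:          # first step: just record, no clipping
            self.mean = grad_norm
            return 1.0
        std = math.sqrt(self.var) + 1e-8
        zscore = (grad_norm - self.mean) / std
        if zscore > self.z:            # outlier: clip back to mean + z * std
            coef = (self.mean + self.z * std) / grad_norm
        else:
            coef = 1.0
        clipped = grad_norm * coef     # update stats with the clipped norm
        self.mean = self.decay * self.mean + (1 - self.decay) * clipped
        self.var = self.decay * self.var + (1 - self.decay) * (clipped - self.mean) ** 2
        return coef
```

In a training loop you would multiply every gradient by `clip_coef(total_norm)` before the optimizer step, the same place `torch.nn.utils.clip_grad_norm_` normally sits.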

The idea came from observing repeated loss spikes during Mistral fine-tuning that standard gradient clipping couldn’t handle cleanly.
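
Based only on the description above (the Space's actual CRMA implementation may differ), the smoothing component can be pictured as an EMA of past gradients with the current gradient's deviation (the "centered residual") shrunk toward that average, so a one-step spike is damped rather than applied at full strength:

```python
class ResidualEMASmoother:
    """Hypothetical illustration of EMA-based gradient smoothing.

    An assumption drawn from the bullet-point description, not the
    Space's real CRMA code: blend=1.0 returns the pure EMA, blend=0.0
    returns the raw gradient unchanged.
    """

    def __init__(self, decay=0.9, blend=0.5):
        self.decay = decay   # EMA decay for the running average
        self.blend = blend   # fraction of the residual to suppress
        self.ema = None

    def smooth(self, grad):
        if self.ema is None:             # first step: nothing to average yet
            self.ema = grad
            return grad
        residual = grad - self.ema       # centered residual w.r.t. the EMA
        smoothed = self.ema + (1 - self.blend) * residual
        self.ema = self.decay * self.ema + (1 - self.decay) * grad
        return smoothed
```

With `decay=0.9, blend=0.5`, a gradient that suddenly jumps from 1.0 to 10.0 is passed through as 5.5, roughly halving the spike while steady-state gradients are untouched.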

Ablation Results (TinyLlama-1.1B on Alpaca-style data)

| Config | Final Loss | Spike Count | Notes |
|---|---|---|---|
| Baseline QLoRA | 1.42 | 7 | Standard setup |
| + ZClip | 1.38 | 3 | Adaptive clipping helps |
| + PiSSA init | 1.35 | 3 | Better initialization |
| + CRMA (full) | 1.31 | 1 | Full stability stack |
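
The table does not define what counts as a "spike", so for reproducing these numbers, here is one common heuristic (an assumption on my part, not necessarily what the Space measures): flag any step whose loss exceeds the trailing window's mean by `k` standard deviations.

```python
import statistics

def count_spikes(losses, window=10, k=3.0):
    """Count loss spikes via a trailing-window z-score heuristic.

    A step counts as a spike when its loss is more than k population
    standard deviations above the mean of the previous `window` losses.
    This is an illustrative definition, not the Space's metric.
    """
    spikes = 0
    for i in range(window, len(losses)):
        recent = losses[i - window:i]
        mu = statistics.mean(recent)
        sd = statistics.pstdev(recent)
        if sd > 0 and losses[i] > mu + k * sd:
            spikes += 1
    return spikes
```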

Why This Might Matter for Leaderboard Submissions

Unstable training tends to carry over into evaluation: loss-spike artifacts can surface as run-to-run variance in benchmark scores. CRMA aims to produce cleaner, more reproducible fine-tunes.

Feedback welcome, especially from anyone who has noticed training instability affecting their leaderboard submissions.

