# Mizan: Gated Parallel Arabic Injection Layers

This repository contains LoRA adapters from Mizan architecture ablations on Qwen3-8B. Each variant tests one combination of injection frequency and gate initialization value.

## Variants

| Adapter | Injection freq | Gate init | Notes |
|---|---|---|---|
| `mizan_every2_gate0.05` | every 2 | 0.05 | Dense injection |
| `mizan_every4_gate0.0` | every 4 | 0.0 | Zero gate init |
| `mizan_every4_gate0.01` | every 4 | 0.01 | Small gate init |
| `mizan_every4_gate0.05` | every 4 | 0.05 | Best configuration |
| `mizan_every4_gate0.05_pretrained` | every 4 | 0.05 | Pretrained weights |
| `mizan_every4_gate0.1` | every 4 | 0.1 | Large gate init |
| `mizan_every8_gate0.05` | every 8 | 0.05 | Sparse injection |
| `mizan_every0_gate0.0_zero` | none | 0.0 | No-injection baseline |
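The exact Mizan layer is not spelled out in this card, but the table's two knobs can be illustrated with a minimal sketch: a low-rank parallel branch whose output is scaled by a learned scalar gate, attached to every N-th transformer block. All names and the low-rank form here are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn as nn


class GatedParallelInjection(nn.Module):
    """Illustrative sketch (not the released code): a low-rank branch added
    in parallel to a transformer block and scaled by a scalar gate.
    "Gate init" is the starting value of that scalar, so gate_init=0.0
    makes the injection an exact no-op at the start of training."""

    def __init__(self, hidden_size: int, rank: int = 16, gate_init: float = 0.05):
        super().__init__()
        self.down = nn.Linear(hidden_size, rank, bias=False)
        self.up = nn.Linear(rank, hidden_size, bias=False)
        self.gate = nn.Parameter(torch.tensor(gate_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Parallel residual path: block output + gate * adapter(x).
        return x + self.gate * self.up(torch.relu(self.down(x)))


# "Injection freq = every N" would mean attaching one such module to every
# N-th transformer block; "every 0" (the baseline row) attaches none.
```

Under this reading, a zero gate init guarantees the model starts exactly at base-model behavior, while a larger init lets the injected branch influence outputs from the first training step.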

## Usage


## Model tree

The adapters in `mariklolik228/Mizan-Qwen3-8B-Adapters` are finetuned from the base model [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B).