# Mizan: Gated Parallel Arabic Injection Layers

LoRA adapters from Mizan architecture ablations on Qwen3-8B. Each variant tests a different combination of injection frequency and gate initialization value.
## Variants
| Adapter | Injection frequency (layers) | Gate init | Notes |
|---|---|---|---|
| mizan_every2_gate0.05 | every 2 | 0.05 | Dense injection |
| mizan_every4_gate0.0 | every 4 | 0.0 | Zero gate init |
| mizan_every4_gate0.01 | every 4 | 0.01 | Small gate init |
| mizan_every4_gate0.05 | every 4 | 0.05 | Best configuration |
| mizan_every4_gate0.05_pretrained | every 4 | 0.05 | Pretrained weights |
| mizan_every4_gate0.1 | every 4 | 0.1 | Large gate init |
| mizan_every8_gate0.05 | every 8 | 0.05 | Sparse injection |
| mizan_every0_gate0.0_zero | None | 0.0 | No injection baseline |
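The gate-initialization values in the table can be read as a scalar that scales the injected branch before it is added back to the residual stream. A minimal sketch of that idea (an assumption about the architecture, not the exact Mizan code — function and variable names are illustrative):

```python
# Sketch of a gated parallel injection step: the adapter branch is added to
# the hidden state, scaled by a learnable scalar gate. NOT the Mizan source;
# an illustration of why gate init matters.
import numpy as np

def gated_injection(hidden, adapter_out, gate):
    """Return hidden + gate * adapter_out, where gate is a learnable scalar."""
    return hidden + gate * adapter_out

hidden = np.ones(4)
adapter_out = np.full(4, 2.0)

# Gate init 0.0: the injected branch contributes nothing at step 0, so
# training starts exactly at the Qwen3-8B baseline.
print(gated_injection(hidden, adapter_out, 0.0))   # [1. 1. 1. 1.]

# Gate init 0.05: a small initial contribution from the injected branch.
print(gated_injection(hidden, adapter_out, 0.05))  # [1.1 1.1 1.1 1.1]
```

Under this reading, larger gate inits let the injected layers influence the model more strongly from the first step, while a zero init makes the injection a no-op until training moves the gate.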
## Usage
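A hedged loading sketch with 🤗 PEFT, assuming each variant is published as a subfolder of this repo (the repo id placeholder and subfolder layout are assumptions, not confirmed by this card; requires `transformers` and `peft`):

```python
# Hypothetical usage sketch — replace the repo id placeholder with this
# repo's actual id. Subfolder names follow the variant table above.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# Load the configuration the table marks as best.
model = PeftModel.from_pretrained(
    base,
    "<this-repo-id>",                   # placeholder: this repo's id
    subfolder="mizan_every4_gate0.05",  # assumed folder layout
)
```

Note that since Mizan adds parallel injection layers on top of the base architecture, loading may additionally require the project's own modeling code rather than plain PEFT alone; treat the snippet above as a starting point.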