Model Mad Science
Collection
Experimental models with weird architectures. Frankenmerges, layer duplications, expert pruning, expert merging, and whatever else seems worth trying.
A fine-tuned version of sandeshrajx/Qwen3.5-24B-A3B-REAP-0.32, itself based on Qwen3.5-35B-A3B. The goal of this project is simple: to build the best reasoning model that fits and runs comfortably on a 16 GB GPU.
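As a rough sanity check on the 16 GB target, the sketch below estimates how much memory the weights alone would need at a few bit-widths. It assumes a parameter count of 24 billion, read off the "24B" in the base model's name, and the bit-widths are illustrative rather than formats this card states it ships in. The KV cache, activations, and quantization overhead (scales, zero points) are ignored, so real usage would be higher.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumption: parameter count inferred from "24B" in the model name.
N_PARAMS = 24e9

# Illustrative bit-widths only; the card does not say which quantization is used.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions, only the 4-bit case leaves headroom on a 16 GB card, which suggests the goal depends on running a quantized build.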
Jackrong's distills
This qwen3_5_moe_text model was trained 2x faster with Unsloth
Base model
sandeshrajx/Qwen3.5-24B-A3B-REAP-0.32