Mixture of Experts (MoE)
Sometimes I fine-tune models specifically to take on expert roles in an MoE configuration; other times I find interesting models that others have fine-tuned.
An 18B-parameter Mixture of Experts model combining 8 specialized 3B experts, with 2 experts activated per token by default (configurable up to 4 at inference).
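As a rough illustration of what "2 experts activated per token" means, here is a minimal top-k routing sketch in plain Python. The logits and expert count are made up for the example; in the real model the router is a learned linear layer over the token's hidden state.

```python
import math

def route_token(router_logits, k=2):
    """Pick the top-k experts for one token and return their mixing weights."""
    # Softmax over the per-expert router logits (max-subtracted for stability)
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Select the k highest-probability experts
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize the weights over just the selected experts (Mixtral-style)
    sel = sum(probs[i] for i in topk)
    return {i: probs[i] / sel for i in topk}

# One router logit per expert (8 experts, as in this model)
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]
print(route_token(logits, k=2))  # two expert indices with weights summing to 1
```

Only the selected experts' feed-forward blocks run for that token, which is why 8 experts of 3B each cost far less per token than a dense model of the same total size.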
| Expert | Specialization |
|---|---|
| LLM-Data-Science-Llama3.2-3B | Machine learning, neural networks, fine-tuning |
| CreativeWriter-Llama3.2-3B | Fiction writing, story structure, scene development |
| Pythonified-Llama-3.2-3B-Instruct | Python coding, debugging, implementation |
| CogBeTh-Llama3.2-3B | Mental health support, anxiety, stress, self-care |
| ReWiz-Llama-3.2-3B | Step-by-step reasoning, careful analysis |
| ReasonableMath-Llama-3.2-3B-Instruct | Calculation, equations, arithmetic |
| TextSynth-3B | Summarization, text analysis, rewriting |
| PersonalFinance-Llama3.2-3B | Budgeting, investing, financial planning |
All experts are fine-tuned from Llama 3.2 3B Instruct, ensuring architectural compatibility across the MoE. The router was initialized using hidden state representations from domain-specific prompts. Built with mergekit.
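Initializing the router from hidden-state representations of domain prompts corresponds to mergekit-moe's `gate_mode: hidden`. A sketch of what such a merge config looks like; the repo paths and prompts below are illustrative placeholders, not the actual config used for this model:

```yaml
base_model: meta-llama/Llama-3.2-3B-Instruct
gate_mode: hidden   # initialize each router from hidden states of the prompts
dtype: bfloat16
experts:
  - source_model: your-username/Pythonified-Llama-3.2-3B-Instruct  # hypothetical path
    positive_prompts:
      - "Write a Python function that"
      - "Debug this code"
  - source_model: your-username/CreativeWriter-Llama3.2-3B  # hypothetical path
    positive_prompts:
      - "Write a short story about"
  # ...one entry per expert, eight in total
```

With `gate_mode: hidden`, mergekit runs each expert's positive prompts through the base model and uses the resulting hidden states to seed that expert's router weights, so tokens resembling those prompts are routed to the matching expert from the start.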
Available quants in the GGUF repo: f16, q8_0, q6_k, q5_k_m, q4_k_m, q4_0, q4_1, iq4_nl, iq4_xs, q3_k_l, q3_k_m, q3_k_s, q2_k
For inference with more than 2 active experts, adjust `num_experts_per_tok` in your inference backend.