# Qwen2-57B-A14B-Instruct-MoNE-48-gsm8k-100

This repository contains a structurally pruned variant of Qwen2-57B-A14B-Instruct, produced with the MoNE (Mixture-of-Novice Experts) framework proposed in our paper.

## Model Overview

- **Base Model:** Qwen2-57B-A14B-Instruct
- **Method:** MoNE structured expert pruning
- **Remaining Experts:** 48
- **Calibration Set:** gsm8k-100
- **Architecture:** Mixture-of-Experts (MoE)
- **Framework:** Transformers-compatible

This checkpoint replaces redundant experts with lightweight novice experts via structured pruning, aiming to reduce compute while preserving performance.
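To give a rough intuition for replacing a full expert with a lightweight novice, here is a toy numpy sketch. It is **not** the MoNE method from the paper: it stands in a low-rank SVD truncation for the novice, and all names (`make_expert`, `noviceify`, the dimensions) are illustrative assumptions. It only shows the structural idea of swapping a heavy expert MLP for a smaller module with the same input/output shape.

```python
import numpy as np

def make_expert(d, h, rng):
    # A toy expert: two-layer ReLU MLP weights (d -> h -> d).
    return (rng.standard_normal((d, h)) / np.sqrt(d),
            rng.standard_normal((h, d)) / np.sqrt(h))

def expert_forward(x, w1, w2):
    # Full expert forward pass.
    return np.maximum(x @ w1, 0.0) @ w2

def noviceify(w1, w2, rank):
    # Illustrative "novice": truncate each weight matrix to a low-rank
    # factorization via SVD, shrinking the parameter count while keeping
    # the expert's input/output interface unchanged.
    def truncate(w):
        u, s, vt = np.linalg.svd(w, full_matrices=False)
        return u[:, :rank] * s[:rank], vt[:rank]
    a1, b1 = truncate(w1)
    a2, b2 = truncate(w2)
    return (a1, b1, a2, b2)

def novice_forward(x, params):
    # Novice forward pass: same signature as the full expert.
    a1, b1, a2, b2 = params
    return np.maximum((x @ a1) @ b1, 0.0) @ a2 @ b2
```

With `d=16, h=64, rank=4`, the full expert holds 2·16·64 = 2048 parameters, while the novice holds 2·(16·4 + 4·64) = 640, yet produces outputs of the same shape, so it can be dropped into the MoE layer in place of the original expert.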

## Paper

- **Title:** MoNE: Replacing Redundant Experts with Lightweight Novices for Structured Pruning of MoE
- **Authors:** Geng Zhang, Yuxuan Han, Yuxuan Lou, Yiqi Zhang, Wangbo Zhao, Yang You
- **arXiv:** arXiv:2507.00390

## Model Details

- **Format:** Safetensors
- **Model size:** 45B params
- **Tensor type:** BF16