# BAGEL-MoE-7B-GEN

This repository contains compressed variants of BAGEL-7B-MoT, based on our paper:

**Understanding and Harnessing Sparsity in Unified Multimodal Models** — [arXiv] · [GitHub]

We study sparsity in unified multimodal models that jointly handle image understanding and generation. Key findings:

  • Understanding components tolerate substantial compression with minimal quality loss.
  • Generation components are highly sensitive to pruning.
  • We propose MoE Adaptation: partition the generation module into multiple experts and activate them sparsely, recovering performance while reducing active parameters.

The compressed models halve the number of active parameters in the generation module while achieving comparable or even improved GenEval scores.
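The sparse-activation idea behind MoE Adaptation can be sketched as a top-k routed feed-forward layer: a router scores all experts per token, and only the k highest-scoring experts are run and mixed. This is a minimal NumPy illustration under assumed shapes and a ReLU-MLP expert, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_moe_forward(x, router_w, experts, k):
    """Route each token to its top-k experts and mix their outputs.

    x:        (tokens, d_model) activations
    router_w: (d_model, n_experts) routing weights
    experts:  list of (W1, W2) weight pairs, one ReLU MLP per expert
    k:        number of experts activated per token
    """
    logits = x @ router_w                       # (tokens, n_experts) router scores
    topk = np.argsort(logits, axis=-1)[:, -k:]  # indices of the k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, topk[t]]
        gates = np.exp(sel - sel.max())
        gates /= gates.sum()                    # softmax over the selected experts only
        for g, e in zip(gates, topk[t]):
            W1, W2 = experts[e]
            h = np.maximum(x[t] @ W1, 0.0)      # expert MLP (only k of n experts run)
            out[t] += g * (h @ W2)
    return out

# Toy configuration mirroring the 32-to-16 setting: 32 experts, 16 active.
d_model, d_ff, n_experts, k = 8, 16, 32, 16
router_w = rng.normal(size=(d_model, n_experts))
experts = [(rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model)))
           for _ in range(n_experts)]
x = rng.normal(size=(4, d_model))
y = sparse_moe_forward(x, router_w, experts, k)
print(y.shape)
```

With k = n_experts / 2, each token only pays for half of the generation module's expert parameters per forward pass, which is where the "active parameters halved" figure comes from.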

## MoE Adaptation

| Model | Active Experts | Repo |
| --- | --- | --- |
| BAGEL-MoE-7B-GEN-32to16 | 16 / 32 | LLM-Drop/BAGEL-MoE-7B-GEN-32to16 |
| BAGEL-MoE-7B-GEN-16to8 | 8 / 16 | LLM-Drop/BAGEL-MoE-7B-GEN-16to8 |

## GenEval Results

| Model | SO | TO | CT | CL | POS | ATTR | ALL |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BAGEL-7B-MoT (original) | 0.99 | 0.94 | 0.81 | 0.95 | 0.72 | 0.77 | 0.86 |
| BAGEL-MoE-7B-GEN-32to16 | 0.99 | 0.94 | 0.87 | 0.93 | 0.79 | 0.78 | 0.89 |
| BAGEL-MoE-7B-GEN-16to8 | 1.00 | 0.92 | 0.82 | 1.00 | 0.77 | 0.83 | 0.89 |

SO: single object, TO: two objects, CT: counting, CL: colors, POS: position, ATTR: attribute binding, ALL: overall.
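GenEval's overall score is typically the unweighted mean of the six category scores; recomputing it from the rounded per-category numbers in the table above reproduces the ALL column to within rounding (the category values shown are themselves rounded to two decimals):

```python
# Recompute the ALL column as the mean of the six GenEval categories.
rows = {
    "BAGEL-7B-MoT":            [0.99, 0.94, 0.81, 0.95, 0.72, 0.77],
    "BAGEL-MoE-7B-GEN-32to16": [0.99, 0.94, 0.87, 0.93, 0.79, 0.78],
    "BAGEL-MoE-7B-GEN-16to8":  [1.00, 0.92, 0.82, 1.00, 0.77, 0.83],
}
for name, scores in rows.items():
    print(f"{name}: {sum(scores) / len(scores):.2f}")
```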

## Installation

```shell
conda create -n efficient_ug python=3.10
conda activate efficient_ug
pip install -r requirements.txt
```

## Evaluation

```shell
# GenEval evaluation
bash scripts/eval/bagel/run_geneval_wr.sh
```

## Citation

```bibtex
@article{he2025sparsity,
  title   = {Understanding and Harnessing Sparsity in Unified Multimodal Models},
  author  = {He, Shwai and others},
  journal = {arXiv preprint arXiv:2512.02351},
  year    = {2025}
}
```
Base model: Qwen/Qwen2.5-7B