DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices
This is the 1.2B DECO checkpoint introduced in the paper DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices.
DECO (Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices) is a sparse MoE architecture designed to match the performance of dense Transformers under identical total parameter budgets and training tokens. It is an improved version of the BlockFFN architecture.
You can load and use this model with AutoTokenizer and AutoModelForCausalLM from transformers. Since the model uses a custom architecture, trust_remote_code=True is required.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "SparseLLM/DECO-1.2B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda").eval()

prompt = "Mixture-of-Experts models are useful because"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)

print(tokenizer.decode(output[0], skip_special_tokens=True))
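The example above hard-codes "cuda", which may not be available on an end-side device. The following is a minimal sketch of a fallback policy. The helper name pick_device_and_dtype and the dtype choices for MPS and CPU are assumptions made for illustration and are not part of the model card; only the CUDA/bfloat16 pairing comes from the example above.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool, mps_available: bool = False) -> Tuple[str, str]:
    """Return (device, dtype_name) for loading the checkpoint.

    Assumed policy (hypothetical, not from the model card):
    bfloat16 on CUDA, float16 on Apple MPS, float32 on CPU.
    """
    if cuda_available:
        return "cuda", "bfloat16"
    if mps_available:
        return "mps", "float16"
    return "cpu", "float32"


# Possible usage with the loading code above (requires torch):
#   device, dtype_name = pick_device_and_dtype(
#       torch.cuda.is_available(), torch.backends.mps.is_available())
#   model = AutoModelForCausalLM.from_pretrained(
#       model_id, torch_dtype=getattr(torch, dtype_name), trust_remote_code=True,
#   ).to(device).eval()
```

The helper takes plain booleans so that it stays free of dependencies; whether the custom DECO modeling code runs correctly on MPS or CPU has not been verified here.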
If you find our work useful for your research, please cite our paper as follows:
@article{song2026deco,
  title={{DECO}: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices},
  author={Chenyang Song and Weilin Zhao and Xu Han and Chaojun Xiao and Yingfa Chen and Zhiyuan Liu},
  journal={arXiv preprint arXiv:2605.10933},
  year={2026},
  url={https://arxiv.org/pdf/2605.10933},
}
Install from pip and serve model

# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "SparseLLM/DECO-1.2B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SparseLLM/DECO-1.2B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
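The same request can be sent from Python using only the standard library. This is a sketch that assumes the vLLM server started above is listening on localhost:8000 and returns the usual OpenAI-style chat completion body; the function names build_chat_request and ask are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request


def build_chat_request(prompt, model="SparseLLM/DECO-1.2B",
                       base_url="http://localhost:8000"):
    """Build the POST request that mirrors the curl call above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, timeout=60.0):
    """Send the request and return the first choice's text.

    Requires a running server; assumes an OpenAI-compatible response
    with choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, ask("What is the capital of France?") returns the generated reply as a string.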