MDM-Prime-v2-Slimpajama
Paper | Project Page | GitHub
MDM-Prime-v2 is an enhanced version of the MDM-Prime framework. MDM-Prime is a discrete diffusion model enhanced with the Partial masking scheme (Prime). It enables fine-grained denoising and improves generation quality across both image and text domains. Refer to our papers for more details:
- MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal Scaling of Diffusion Language Models.
- Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking.
Model Details
- Dataset: Slimpajama
- Model Size: 1.1B
- Context Length: 2,048
How to Use
To download the weights, install the huggingface_hub library via pip install -U huggingface_hub and run the following Python code:
from huggingface_hub import hf_hub_download
path = hf_hub_download(
    repo_id="chen-hao-chao/mdm-prime-v2-slimpajama",
    filename="${checkpoint_name}"
)
Replace ${checkpoint_name} with mdm-prime-v2-3300flops.pth or mdm-prime-v2-6600flops.pth. Checkpoints with the -weight-only suffix contain only the model weights, which makes them faster to download and to load for inference. This repository is organized as follows:
mdm-prime-v2-slimpajama/
├── README.md
├── mdm-prime-v2-3300flops.pth
├── mdm-prime-v2-3300flops-weight-only.pth
├── mdm-prime-v2-6600flops.pth
└── mdm-prime-v2-6600flops-weight-only.pth
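The naming pattern in the listing above can be wrapped in a small helper so that the checkpoint name is built from a compute budget and a weight-only flag instead of being typed by hand. This is only a sketch: the helper names checkpoint_filename and download_checkpoint and the constants are hypothetical and not part of the repository, while hf_hub_download is the same huggingface_hub call used earlier.

```python
# Hypothetical helpers (not part of the MDM-Prime-v2 codebase) that build one
# of the four checkpoint names listed above and download it from the Hub.

REPO_ID = "chen-hao-chao/mdm-prime-v2-slimpajama"
VALID_BUDGETS = (3300, 6600)  # the two budgets that appear in the file names


def checkpoint_filename(flops: int, weight_only: bool = False) -> str:
    """Return the checkpoint file name for a budget and weight-only flag."""
    if flops not in VALID_BUDGETS:
        raise ValueError(f"flops must be one of {VALID_BUDGETS}, got {flops}")
    suffix = "-weight-only" if weight_only else ""
    return f"mdm-prime-v2-{flops}flops{suffix}.pth"


def download_checkpoint(flops: int, weight_only: bool = False) -> str:
    """Download the selected checkpoint and return its local cache path."""
    # Imported lazily so the naming helper works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=checkpoint_filename(flops, weight_only),
    )
```

For example, download_checkpoint(3300, weight_only=True) would fetch mdm-prime-v2-3300flops-weight-only.pth and return the path of the cached file.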
For more details regarding the training and inference processes, please refer to our GitHub repository: chen-hao-chao/mdm-prime-v2.
Citing MDM-Prime and MDM-Prime-v2
If you find this repository useful, please consider citing our papers.
@article{chao2026mdmprimev2,
title = {{MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal Scaling of Diffusion Language Models}},
author = {Chen-Hao Chao and Wei-Fang Sun and Junwei Quan and Chun-Yi Lee and Rahul G. Krishnan},
year = {2026},
}
@inproceedings{chao2025mdmprime,
title = {{Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking}},
author = {Chen-Hao Chao and Wei-Fang Sun and Hanwen Liang and Chun-Yi Lee and Rahul G. Krishnan},
booktitle = {Proceedings of the Conference on Neural Information Processing Systems (NeurIPS)},
year = {2025},
}