apple/DiffuCoder-7B-Base

	---
	license: unknown
	base_model:
	- Qwen/Qwen2.5-Coder-7B
	tags:
	- code
	- text-diffusion-model
	- diffusion large language model
	---

	### DiffuCoder-7B-Base

	The DiffuCoder-7B-Base model is our foundational masked diffusion LLM for code generation.

	- Training recipe: Follows [DiffuLLaMA](https://github.com/HKUNLP/DiffuLLaMA)'s adaptation approach. The model is trained on a large corpus of code, with 65B tokens in Stage 1 and 65B tokens in Stage 2.

	- Benchmarks: Strong baseline performance on HumanEval, MBPP and BigCodeBench.
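
For readers new to masked diffusion decoding, the following is a minimal toy sketch of the general idea: start from a fully masked sequence and fill in tokens over several denoising steps. It is not the DiffuCoder implementation. `toy_denoiser` is a hypothetical stand-in for a real network (it simply looks up a known target and assigns random confidences), and the "unmask the most confident positions first" schedule is an assumption chosen for illustration, not necessarily the sampling strategy this model uses.

```python
import random

MASK = "[MASK]"


def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a real model.

    For every masked position, return a (predicted_token, confidence) pair.
    A real denoiser would predict tokens from context; this one cheats by
    reading them from `target` and drawing a random confidence.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def masked_diffusion_decode(target, steps, seed=0):
    """Iteratively unmask a fully masked sequence in at most `steps` steps."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]  # snapshot of the sequence after each step

    for step in range(steps):
        predictions = toy_denoiser(tokens, target, rng)
        if not predictions:
            break  # nothing left to unmask

        # Spread the remaining masked positions evenly over the remaining steps.
        remaining_steps = steps - step
        k = -(-len(predictions) // remaining_steps)  # ceiling division

        # Commit the k positions the denoiser is most confident about.
        most_confident = sorted(
            predictions, key=lambda i: predictions[i][1], reverse=True
        )[:k]
        for i in most_confident:
            tokens[i] = predictions[i][0]
        history.append(list(tokens))

    return tokens, history


if __name__ == "__main__":
    target = "def add ( a , b ) :".split()
    final, history = masked_diffusion_decode(target, steps=4)
    for step, snapshot in enumerate(history):
        print(step, " ".join(snapshot))
```

Unlike left-to-right autoregressive decoding, the positions here are filled in an order determined by confidence rather than by index, which is the property the sketch is meant to highlight.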


	#### More details and usage examples:

	- Paper: [DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation](https://arxiv.org/abs/2506.20639)

	- GitHub: https://github.com/apple/ml-diffucoder

	#### Acknowledgement
	To power this Hugging Face model release, we reuse [Dream](https://huggingface.co/Dream-org/Dream-v0-Base-7B)'s modeling architecture and generation utilities.