---
license: apache-2.0
pipeline_tag: text-generation
datasets:
- codeparrot/github-code-clean
- bigcode/starcoderdata
- bigcode/the-stack-smol
tags:
- diffusion
- llm
- diffreaper
- dllm
- mercury
language:
- en
---
# DiffReaper 3

DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction.
Unlike traditional autoregressive models, which emit one token at a time from left to right, DiffReaper is optimized for non-linear (not strictly left-to-right) sequence refinement across a mixed corpus of Python code and natural language.
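
To make the contrast with autoregressive decoding concrete, here is a minimal sketch of masked-diffusion-style decoding: start from a fully masked sequence, score every position in one pass, and commit the most confident proposals at each step. This is a generic illustration, not DiffReaper's actual sampler; `toy_denoiser`, the `MASK` string, the toy vocabulary, and the equal-share unmasking schedule are all assumptions made for the example.

```python
import random

MASK = "<mask>"  # hypothetical mask token; the real tokenizer's mask id is not stated here


def toy_denoiser(tokens, rng):
    """Stand-in for the model: propose a (token, confidence) pair for every position.

    A real dLLM would produce these from one forward pass over the whole sequence.
    """
    vocab = ["def", "f", "(", ")", ":", "return", "x"]
    return [(rng.choice(vocab), rng.random()) for _ in tokens]


def diffusion_decode(length, steps, seed=0):
    """Iteratively unmask a sequence, committing several positions per step."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        proposals = toy_denoiser(tokens, rng)
        # Commit an equal share of the remaining masked positions per step,
        # picking the most confident proposals first (assumed schedule).
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = proposals[i][0]
    return tokens


if __name__ == "__main__":
    print(diffusion_decode(length=8, steps=4))
```

With `length=8` and `steps=4`, two positions are filled per step, in whatever order the confidence scores dictate, so the sequence is complete after four passes instead of eight.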

## Model Details
- **Architecture:** 24-Layer Transformer Encoder
- **Hidden Dimension:** 2048
- **Attention Heads:** 16
- **Objective:** Discrete Masked Diffusion (Mercury-style)
- **Training Precision:** BF16
- **Context Window:** 1024 tokens
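
The figures above can be collected into a small config object to derive quantities the list leaves implicit, such as the per-head dimension. The sketch below is illustrative only: `DiffReaperConfig` is a hypothetical name, and the 4x feed-forward expansion is an assumption, since the card does not state the feed-forward width or the vocabulary size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffReaperConfig:
    # Values taken from the Model Details list
    num_layers: int = 24
    hidden_size: int = 2048
    num_heads: int = 16
    max_positions: int = 1024
    # Assumption: conventional 4x feed-forward expansion (not stated in this card)
    ffn_multiplier: int = 4

    @property
    def head_dim(self) -> int:
        return self.hidden_size // self.num_heads

    def block_params(self) -> int:
        """Approximate weight count of the transformer blocks only.

        Ignores biases, layer norms, and token/position embeddings.
        """
        attn = 4 * self.hidden_size ** 2  # Q, K, V, and output projections
        ffn = 2 * self.ffn_multiplier * self.hidden_size ** 2  # up and down projections
        return self.num_layers * (attn + ffn)


cfg = DiffReaperConfig()
print(cfg.head_dim)                         # 128
print(f"{cfg.block_params() / 1e9:.2f}B")   # 1.21B
```

Under these assumptions the transformer blocks account for roughly 1.21B weights; the remainder of the stated 1.5B would have to come from embeddings, a wider feed-forward layer, or other components not listed here.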