# LLaDA-8B-Instruct-s1k-sft
LLaDA-8B-Instruct-s1k-sft is a diffusion-based instruct model post-trained from LLaDA-8B-Instruct on simplescaling/s1K, using Masked Diffusion Language Modeling (MDLM) within the dLLM framework.
## Model Overview
LLaDA-8B-Instruct-s1k-sft has the following features:
- Method: Masked Diffusion Language Modeling (MDLM)
- Framework: dLLM
- Base model: LLaDA-8B-Instruct
- Dataset (SFT): simplescaling/s1K
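MDLM trains by masking a random fraction of tokens and teaching the model to reconstruct them. The sketch below is an illustrative toy version of the masking step only (the `MASK_ID` value and function shape are assumptions for illustration, not the dLLM implementation):

```python
import random

MASK_ID = -1  # hypothetical mask-token id, for illustration only


def mdlm_mask(tokens, t, rng):
    """Mask each token independently with probability t.

    In masked-diffusion training, t is sampled per sequence (e.g.
    t ~ Uniform(0, 1)); the model is then trained to predict the
    original tokens at the masked positions.
    """
    masked = [MASK_ID if rng.random() < t else tok for tok in tokens]
    # Prediction targets exist only where a token was masked.
    targets = [tok if m == MASK_ID else None for tok, m in zip(tokens, masked)]
    return masked, targets


rng = random.Random(0)
masked, targets = mdlm_mask([5, 9, 2, 7], t=0.5, rng=rng)
```

At lower `t` few tokens are masked (an easy denoising problem); at `t` near 1 almost the whole sequence must be reconstructed, which is what enables generation from a fully masked canvas at inference time.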
For broader training and ablation reporting in the dLLM ecosystem, see the dLLM paper.
Eval notes: metrics use confidence-threshold decoding (`alg: confidence_threshold`). The primary table reports confidence_threshold = 0.9; the full grids sweep confidence_threshold ∈ {0.6, 0.7, 0.8, 0.9} with max_new_tokens ∈ {256, 512}. Tokens per second (TPS) was not recorded for this checkpoint.
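Confidence-threshold decoding, as used in these evals, unmasks in parallel every position whose top-token probability clears the threshold τ. A minimal sketch of one decoding step, assuming a greedy fallback when no position qualifies (the function name and fallback rule are illustrative reconstructions, not the dLLM code):

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def confidence_threshold_step(logits_per_pos, masked_positions, tau):
    """One decoding step: commit every masked position whose top-token
    probability exceeds tau; if none qualify, commit the single most
    confident position so decoding always makes progress.
    """
    decisions = {}
    for pos in masked_positions:
        probs = softmax(logits_per_pos[pos])
        conf = max(probs)
        decisions[pos] = (conf, probs.index(conf))
    chosen = {p: tok for p, (conf, tok) in decisions.items() if conf > tau}
    if not chosen:  # fallback: unmask only the most confident position
        best = max(decisions, key=lambda p: decisions[p][0])
        chosen = {best: decisions[best][1]}
    return chosen
```

Higher τ commits fewer tokens per step (more steps, typically higher accuracy), which matches the monotone accuracy gains from τ = 0.6 to τ = 0.9 seen in the sweeps below.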
## Primary results
| Benchmark | Acc % (max_new_tokens=256) | Acc % (max_new_tokens=512) |
|---|---|---|
| GSM8K | 82.03 | 82.79 |
| HumanEval | 39.63 | 47.56 |
| MBPP | 34.40 | 23.00 |
| MATH | N/A | 41.24 |
## Threshold sweep
| Benchmark | Acc % (τ=0.6) | Acc % (τ=0.7) | Acc % (τ=0.8) | Acc % (τ=0.9) |
|---|---|---|---|---|
| GSM8K | 73.46 | 78.62 | 80.82 | 82.03 |
| HumanEval | 24.39 | 31.71 | 39.02 | 39.63 |
| MBPP | 30.20 | 32.20 | 33.80 | 34.40 |
| MATH | N/A | N/A | N/A | N/A |
max_new_tokens=256; columns sweep confidence_threshold τ ∈ {0.6, 0.7, 0.8, 0.9}.
| Benchmark | Acc % (τ=0.6) | Acc % (τ=0.7) | Acc % (τ=0.8) | Acc % (τ=0.9) |
|---|---|---|---|---|
| GSM8K | 75.28 | 78.92 | 81.58 | 82.79 |
| HumanEval | 33.54 | 38.41 | 45.12 | 47.56 |
| MBPP | 24.40 | 24.80 | 23.60 | 23.00 |
| MATH | 34.28 | 38.34 | 40.50 | 41.24 |
max_new_tokens=512; columns sweep confidence_threshold τ ∈ {0.6, 0.7, 0.8, 0.9}.
## Model tree for OnAnOrange/LLaDA-8B-Instruct-s1k-sft
- Base model: GSAI-ML/LLaDA-8B-Instruct