---
license: apache-2.0
language:
- en
tags:
- language-model
- flow-matching
- diffusion
- hypersphere
- discrete-diffusion
datasets:
- tinygsm
- openwebtext
library_name: pytorch
---

# Language Modeling with Hyperspherical Flows

By [Justin Deschenaux](https://jdeschena.com) and [Caglar Gulcehre](https://www.caglar.ai).

[arXiv](https://arxiv.org/abs/2605.11125)
[Blog](https://jdeschena.com/blog/sfm)
[GitHub](https://github.com/jdeschena/s-flm)

This repo hosts the pretrained checkpoints for **Language Modeling with Hyperspherical Flows** (S-FLM). For the abstract, training/sampling code, and reproduction scripts, see the companion code repo: [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm).

# Checkpoints

This repo contains checkpoints for S-FLM and the baselines we compare against (AR, MDLM, Duo, FLM, CANDI), trained on **TinyGSM** (250k steps, SmolLM-135M tokenizer) and **OpenWebText** (1M steps, GPT-2 tokenizer). The files are laid out as follows:

```
tinygsm/{ar,mdlm,duo}.ckpt
tinygsm/candi/{lr3e-4,lr1e-3}.ckpt
tinygsm/flm/{default,caps}.ckpt
tinygsm/sfm/{sphere_dit_truncated_fixed_no_renorm,
             sphere_dit_truncated_adaptive_no_renorm,
             sphere_arch_truncated_adaptive_no_renorm}.ckpt

owt/{ar,mdlm,duo,flm,sfm}.ckpt
```

```bash
huggingface-cli download jdeschena/s-flm tinygsm/duo.ckpt --local-dir ./checkpoints
```
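
To fetch several checkpoints at once, the single-file command can be wrapped in a loop. The sketch below is a dry run that only prints one command per OpenWebText checkpoint. It assumes the files sit at `owt/<name>.ckpt` as in the listing above; the helper name `print_download_cmds` and the variables `repo` and `dest` are illustrative and not part of the code repo.

```bash
# Dry run: print one download command per OpenWebText checkpoint.
# Assumption: files live at owt/<name>.ckpt, as in the checkpoint listing.
repo="jdeschena/s-flm"
dest="./checkpoints"   # same local directory as in the single-file command

# Hypothetical helper: emits the commands without running them.
print_download_cmds() {
  for name in ar mdlm duo flm sfm; do
    echo "huggingface-cli download $repo owt/$name.ckpt --local-dir $dest"
  done
}

print_download_cmds
```

Once the printed commands look right, piping them to a shell (`print_download_cmds | sh`) should run the downloads one after another.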

Loading and sampling are handled by the code repo; see [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm) for the scripts.

# Citation

```
@misc{deschenaux2026languagemodelinghypersphericalflows,
  title={Language Modeling with Hyperspherical Flows},
  author={Justin Deschenaux and Caglar Gulcehre},
  year={2026},
  eprint={2605.11125},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2605.11125},
}
```