---
license: apache-2.0
language:
- en
tags:
- language-model
- flow-matching
- diffusion
- hypersphere
- discrete-diffusion
datasets:
- tinygsm
- openwebtext
library_name: pytorch
---

# Language Modeling with Hyperspherical Flows

By [Justin Deschenaux](https://jdeschena.com) and [Caglar Gulcehre](https://www.caglar.ai).

[![arXiv](https://img.shields.io/badge/arXiv-2605.11125-red.svg)](https://arxiv.org/abs/2605.11125)
[![Blog](https://img.shields.io/badge/Blog%20%20-8A2BE2)](https://jdeschena.com/blog/sfm)
[![Code](https://img.shields.io/badge/Code-181717?logo=github&logoColor=white)](https://github.com/jdeschena/s-flm)

This repo hosts the pretrained checkpoints for **Language Modeling with Hyperspherical Flows** (𝕊-FLM). For the abstract, training/sampling code, and reproduction scripts, see the companion code repo: [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm).

# Checkpoints

𝕊-FLM and the baselines we compare against (AR, MDLM, Duo, FLM, CANDI), trained on **TinyGSM** (250k steps, SmolLM-135M tokenizer) and **OpenWebText** (1M steps, GPT-2 tokenizer).

```
tinygsm/{ar,mdlm,duo}.ckpt
tinygsm/candi/{lr3e-4,lr1e-3}.ckpt
tinygsm/flm/{default,caps}.ckpt
tinygsm/sfm/{sphere_dit_truncated_fixed_no_renorm,
             sphere_dit_truncated_adaptive_no_renorm,
             sphere_arch_truncated_adaptive_no_renorm}.ckpt
owt/{ar,mdlm,duo,flm,sfm}.ckpt
```

```bash
huggingface-cli download jdeschena/s-flm tinygsm/duo.ckpt --local-dir ./checkpoints
```

Loading and sampling are handled by the code repo; see [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm) for the scripts.

# Citation

```
@misc{deschenaux2026languagemodelinghypersphericalflows,
  title={Language Modeling with Hyperspherical Flows},
  author={Justin Deschenaux and Caglar Gulcehre},
  year={2026},
  eprint={2605.11125},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2605.11125},
}
```
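
# Downloading from Python

The `huggingface-cli download` command in the Checkpoints section can also be issued from Python. The following is a minimal sketch, not part of the official code repo: it assumes the `huggingface_hub` package is installed, and the helper names `list_checkpoints` and `download_checkpoint` are hypothetical, introduced here only for illustration. The paths are transcribed from the checkpoint layout listed in the Checkpoints section.

```python
# Minimal sketch (assumption: `huggingface_hub` is installed).
# `list_checkpoints` and `download_checkpoint` are hypothetical helper names,
# not functions provided by the s-flm code repo.

REPO_ID = "jdeschena/s-flm"  # repo id taken from the CLI command in this README

# S-FLM variants trained on TinyGSM, as listed in the checkpoint layout.
TINYGSM_SFM_VARIANTS = (
    "sphere_dit_truncated_fixed_no_renorm",
    "sphere_dit_truncated_adaptive_no_renorm",
    "sphere_arch_truncated_adaptive_no_renorm",
)


def list_checkpoints():
    """Expand the brace patterns of the checkpoint layout into full paths."""
    paths = [f"tinygsm/{name}.ckpt" for name in ("ar", "mdlm", "duo")]
    paths += [f"tinygsm/candi/{name}.ckpt" for name in ("lr3e-4", "lr1e-3")]
    paths += [f"tinygsm/flm/{name}.ckpt" for name in ("default", "caps")]
    paths += [f"tinygsm/sfm/{name}.ckpt" for name in TINYGSM_SFM_VARIANTS]
    paths += [f"owt/{name}.ckpt" for name in ("ar", "mdlm", "duo", "flm", "sfm")]
    return paths


def download_checkpoint(path, local_dir="./checkpoints"):
    """Download one checkpoint file and return its local path."""
    if path not in list_checkpoints():
        raise ValueError(f"Unknown checkpoint path: {path}")
    # Imported lazily so the path listing works without the dependency.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=path, local_dir=local_dir)


if __name__ == "__main__":
    # Print the available checkpoint paths; downloading requires network access.
    for ckpt in list_checkpoints():
        print(ckpt)
```

Calling `download_checkpoint("tinygsm/duo.ckpt")` should fetch the same file as the CLI command in the Checkpoints section. Loading the weights and sampling still go through the scripts in the code repo.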