---
license: apache-2.0
language:
- en
tags:
- language-model
- flow-matching
- diffusion
- hypersphere
- discrete-diffusion
datasets:
- tinygsm
- openwebtext
library_name: pytorch
---
# Language Modeling with Hyperspherical Flows
By [Justin Deschenaux](https://jdeschena.com) and [Caglar Gulcehre](https://www.caglar.ai).
[arXiv](https://arxiv.org/abs/2605.11125) · [Blog](https://jdeschena.com/blog/sfm) · [Code](https://github.com/jdeschena/s-flm)
This repo hosts the pretrained checkpoints for **Language Modeling with Hyperspherical Flows** (𝕊-FLM). For the abstract, training/sampling code, and reproduction scripts, see the companion code repo: [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm).
# Checkpoints
We release checkpoints for 𝕊-FLM and the baselines we compare against (AR, MDLM, Duo, FLM, CANDI), trained on **TinyGSM** (250k steps, SmolLM-135M tokenizer) and **OpenWebText** (1M steps, GPT-2 tokenizer).
```
tinygsm/{ar,mdlm,duo}.ckpt
tinygsm/candi/{lr3e-4,lr1e-3}.ckpt
tinygsm/flm/{default,caps}.ckpt
tinygsm/sfm/{sphere_dit_truncated_fixed_no_renorm,
sphere_dit_truncated_adaptive_no_renorm,
sphere_arch_truncated_adaptive_no_renorm}.ckpt
owt/{ar,mdlm,duo,flm,sfm}.ckpt
```
For example, to download a single checkpoint:
```bash
huggingface-cli download jdeschena/s-flm tinygsm/duo.ckpt --local-dir ./checkpoints
```
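If you prefer Python, the same download can be done with the `huggingface_hub` library (a minimal sketch; adjust `local_dir` as needed):

```python
from huggingface_hub import hf_hub_download

# Fetch one checkpoint from the Hub into ./checkpoints,
# mirroring the repo's tinygsm/duo.ckpt layout.
path = hf_hub_download(
    repo_id="jdeschena/s-flm",
    filename="tinygsm/duo.ckpt",
    local_dir="./checkpoints",
)
print(path)  # local path of the downloaded checkpoint
```

`hf_hub_download` caches files, so repeated calls do not re-download the checkpoint.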
Loading and sampling are handled by the code repo — see [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm) for the scripts.
# Citation
```bibtex
@misc{deschenaux2026languagemodelinghypersphericalflows,
title={Language Modeling with Hyperspherical Flows},
author={Justin Deschenaux and Caglar Gulcehre},
year={2026},
eprint={2605.11125},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2605.11125},
}
```