---
license: apache-2.0
language:
- en
tags:
- language-model
- flow-matching
- diffusion
- hypersphere
- discrete-diffusion
datasets:
- tinygsm
- openwebtext
library_name: pytorch
---

# Language Modeling with Hyperspherical Flows

By [Justin Deschenaux](https://jdeschena.com) and [Caglar Gulcehre](https://www.caglar.ai).

[![arXiv](https://img.shields.io/badge/arXiv-2605.11125-red.svg)](https://arxiv.org/abs/2605.11125)
[![Blog](https://img.shields.io/badge/Blog%20%20-8A2BE2)](https://jdeschena.com/blog/sfm)
[![Code](https://img.shields.io/badge/Code-181717?logo=github&logoColor=white)](https://github.com/jdeschena/s-flm)

This repo hosts the pretrained checkpoints for **Language Modeling with Hyperspherical Flows** (𝕊-FLM). For the abstract, training/sampling code, and reproduction scripts, see the companion code repo: [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm).

# Checkpoints

The checkpoints cover 𝕊-FLM and the baselines we compare against (AR, MDLM, Duo, FLM, CANDI), trained on **TinyGSM** (250k steps, SmolLM-135M tokenizer) and **OpenWebText** (1M steps, GPT-2 tokenizer). Files are laid out as follows:

```
tinygsm/{ar,mdlm,duo}.ckpt
tinygsm/candi/{lr3e-4,lr1e-3}.ckpt
tinygsm/flm/{default,caps}.ckpt
tinygsm/sfm/{sphere_dit_truncated_fixed_no_renorm,
             sphere_dit_truncated_adaptive_no_renorm,
             sphere_arch_truncated_adaptive_no_renorm}.ckpt

owt/{ar,mdlm,duo,flm,sfm}.ckpt
```
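
If you want to script over the checkpoints, the listing above can be written out as explicit repo-relative paths. The snippet below is a minimal sketch: the `CHECKPOINTS` dictionary and the `checkpoint_paths` helper are hypothetical names introduced for illustration (they are not part of the code repo), and the entries are copied by hand from the listing, so they need updating if the repo layout changes.

```python
# Hypothetical helper: enumerates the checkpoint files listed above.
# Nothing here contacts the Hub; the names are copied from the listing.
CHECKPOINTS = {
    "tinygsm": [
        "ar",
        "mdlm",
        "duo",
        "candi/lr3e-4",
        "candi/lr1e-3",
        "flm/default",
        "flm/caps",
        "sfm/sphere_dit_truncated_fixed_no_renorm",
        "sfm/sphere_dit_truncated_adaptive_no_renorm",
        "sfm/sphere_arch_truncated_adaptive_no_renorm",
    ],
    "owt": ["ar", "mdlm", "duo", "flm", "sfm"],
}


def checkpoint_paths(dataset=None):
    """Return repo-relative .ckpt paths, optionally restricted to one dataset."""
    if dataset is not None and dataset not in CHECKPOINTS:
        raise KeyError(f"unknown dataset {dataset!r}; expected one of {list(CHECKPOINTS)}")
    datasets = [dataset] if dataset is not None else list(CHECKPOINTS)
    return [f"{d}/{name}.ckpt" for d in datasets for name in CHECKPOINTS[d]]
```

For example, `checkpoint_paths("owt")` yields the five OpenWebText paths, and `checkpoint_paths()` yields every path in the listing.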

```bash
huggingface-cli download jdeschena/s-flm tinygsm/duo.ckpt --local-dir ./checkpoints
```
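
The same download can likely be done from Python with `hf_hub_download` from the `huggingface_hub` package. This is a sketch under assumptions: the repo id is taken from the CLI command above, `download_checkpoint` is a hypothetical wrapper name, and the `.ckpt` suffix check is only a convenience guard, not something the Hub requires.

```python
from pathlib import Path

# Repo id taken from the CLI command above.
REPO_ID = "jdeschena/s-flm"


def download_checkpoint(filename, local_dir="./checkpoints"):
    """Download one checkpoint file from the Hub and return its local path.

    `filename` is a repo-relative path such as "tinygsm/duo.ckpt".
    """
    if not filename.endswith(".ckpt"):
        raise ValueError(f"expected a .ckpt file, got {filename!r}")
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=filename,
        local_dir=local_dir,
    )
    return Path(local_path)
```

Calling `download_checkpoint("tinygsm/duo.ckpt")` should fetch the same file as the command above. It requires network access and `pip install huggingface_hub`.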

Loading and sampling are handled by the code repo — see [`jdeschena/s-flm`](https://github.com/jdeschena/s-flm) for the scripts.

# Citation

```bibtex
@misc{deschenaux2026languagemodelinghypersphericalflows,
      title={Language Modeling with Hyperspherical Flows},
      author={Justin Deschenaux and Caglar Gulcehre},
      year={2026},
      eprint={2605.11125},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2605.11125},
}
```