metadata
license: apache-2.0
language:
  - en
tags:
  - language-model
  - flow-matching
  - diffusion
  - hypersphere
  - discrete-diffusion
datasets:
  - tinygsm
  - openwebtext
library_name: pytorch

Language Modeling with Hyperspherical Flows

By Justin Deschenaux and Caglar Gulcehre.

arXiv Blog Code

This repo hosts the pretrained checkpoints for Language Modeling with Hyperspherical Flows (𝕊-FLM). For the abstract, training/sampling code, and reproduction scripts, see the companion code repo: jdeschena/s-flm.

Checkpoints

We release checkpoints for 𝕊-FLM and the baselines we compare against (AR, MDLM, Duo, FLM, CANDI). Models are trained on TinyGSM (250k steps, SmolLM-135M tokenizer) or OpenWebText (1M steps, GPT-2 tokenizer). The files are laid out as follows:

tinygsm/{ar,mdlm,duo}.ckpt
tinygsm/candi/{lr3e-4,lr1e-3}.ckpt
tinygsm/flm/{default,caps}.ckpt
tinygsm/sfm/{sphere_dit_truncated_fixed_no_renorm,
             sphere_dit_truncated_adaptive_no_renorm,
             sphere_arch_truncated_adaptive_no_renorm}.ckpt

owt/{ar,mdlm,duo,flm,sfm}.ckpt

To download a single checkpoint (here tinygsm/duo.ckpt) into ./checkpoints:

huggingface-cli download jdeschena/s-flm tinygsm/duo.ckpt --local-dir ./checkpoints
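
The same download can be sketched from Python with the `huggingface_hub` library. This is a minimal illustration, not part of the code repo: the `CHECKPOINTS` list is transcribed from the file layout above, and `download_checkpoint` is a hypothetical helper name introduced here.

```python
REPO_ID = "jdeschena/s-flm"

# File paths transcribed from the layout listed above (not queried from the Hub).
CHECKPOINTS = (
    [f"tinygsm/{name}.ckpt" for name in ("ar", "mdlm", "duo")]
    + [f"tinygsm/candi/{name}.ckpt" for name in ("lr3e-4", "lr1e-3")]
    + [f"tinygsm/flm/{name}.ckpt" for name in ("default", "caps")]
    + [
        f"tinygsm/sfm/{name}.ckpt"
        for name in (
            "sphere_dit_truncated_fixed_no_renorm",
            "sphere_dit_truncated_adaptive_no_renorm",
            "sphere_arch_truncated_adaptive_no_renorm",
        )
    ]
    + [f"owt/{name}.ckpt" for name in ("ar", "mdlm", "duo", "flm", "sfm")]
)


def download_checkpoint(filename: str, local_dir: str = "./checkpoints") -> str:
    """Download one checkpoint file and return its local path (hypothetical helper)."""
    if filename not in CHECKPOINTS:
        raise ValueError(f"Unknown checkpoint: {filename!r}")
    # Imported lazily so the path list can be used without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=local_dir)
```

For example, `download_checkpoint("tinygsm/duo.ckpt")` should fetch the same file as the CLI command above. Checking the name against the list first catches typos before any network request is made.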

Loading and sampling are handled by the code repo; see jdeschena/s-flm for the scripts.
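
If you only want to check what a downloaded file contains before setting up the code repo, a quick inspection might look like the sketch below. It rests on assumptions this README does not confirm: that each `.ckpt` file is a dictionary serialized with `torch.save`, and that the weights, if present, sit under a `state_dict` key (the PyTorch Lightning convention). Both function names are hypothetical.

```python
from typing import Any, Mapping


def load_checkpoint(path: str) -> Mapping[str, Any]:
    """Load a checkpoint on CPU, assuming it was written with torch.save."""
    import torch  # imported lazily; only needed for actual loading

    # weights_only=True refuses to unpickle arbitrary objects. If the file
    # stores non-tensor objects, loading may fail with this setting.
    return torch.load(path, map_location="cpu", weights_only=True)


def summarize_checkpoint(ckpt: Mapping[str, Any]) -> dict:
    """Report the top-level keys and how many entries 'state_dict' holds."""
    # ASSUMPTION: weights live under "state_dict"; the count is 0 if the key is absent.
    state = ckpt.get("state_dict", {})
    return {
        "top_level_keys": sorted(ckpt.keys()),
        "num_state_entries": len(state),
    }
```

Calling `summarize_checkpoint(load_checkpoint("./checkpoints/tinygsm/duo.ckpt"))` would then show the file's structure. For actual model construction and sampling, use the scripts in the code repo.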

Citation

@misc{deschenaux2026languagemodelinghypersphericalflows,
      title={Language Modeling with Hyperspherical Flows}, 
      author={Justin Deschenaux and Caglar Gulcehre},
      year={2026},
      eprint={2605.11125},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2605.11125}, 
}