Discrete Audio IsoFLOP Model (discrete-audio-isoflop-9e19-1.68B-d1920-L19-B64-e9e495)

One of a suite of discrete audio models trained for our IsoFLOP study as part of SODA, which performs unified next-token prediction on interleaved semantic, acoustic, and text tokens.
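The checkpoint name in the title appears to encode the run configuration. As a minimal sketch, the snippet below parses such a name into its fields. The pattern is an assumption inferred from this single name (`discrete-audio-isoflop-9e19-1.68B-d1920-L19-B64-e9e495`), and `NAME_PATTERN` and `parse_checkpoint_name` are hypothetical helpers, not part of any SODA tooling; other checkpoints in the suite may follow a different convention.

```python
import re

# Hypothetical pattern, inferred from this one checkpoint name only.
NAME_PATTERN = re.compile(
    r"^discrete-audio-isoflop-"
    r"(?P<compute_budget>[0-9.]+e[0-9]+)-"   # e.g. 9e19
    r"(?P<param_count>[0-9.]+)B-"            # e.g. 1.68B (billions of parameters)
    r"d(?P<hidden_dim>\d+)-"                 # e.g. d1920
    r"L(?P<num_layers>\d+)-"                 # e.g. L19
    r"B(?P<batch_size>\d+)-"                 # e.g. B64
    r"(?P<hash_key>[0-9a-f]+)$"              # e.g. e9e495
)


def parse_checkpoint_name(name: str) -> dict:
    """Split a checkpoint name into typed configuration fields."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not match the assumed pattern: {name!r}")
    fields = match.groupdict()
    return {
        "compute_budget": float(fields["compute_budget"]),
        "param_count": float(fields["param_count"]) * 1e9,
        "hidden_dim": int(fields["hidden_dim"]),
        "num_layers": int(fields["num_layers"]),
        "batch_size": int(fields["batch_size"]),
        "hash_key": fields["hash_key"],
    }


config = parse_checkpoint_name(
    "discrete-audio-isoflop-9e19-1.68B-d1920-L19-B64-e9e495"
)
print(config)
```

A name that does not fit the assumed pattern raises `ValueError` rather than returning partial fields, which keeps a mismatched convention from going unnoticed.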

🥤 Project Page: https://soda-audio.github.io

For full usage instructions (e.g., inference code) and more information, please refer to the SODA-4B-base model card.

The details for this particular model are as follows:

  • compute_budget: 9e19
  • param_count (non-embedding): 1.68B
  • hidden_dim: 1920
  • num_layers: 19
  • batch_size: 64
  • training_step: 33678
  • hash_key: e9e495
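To relate the numbers above to each other, here is a rough back-of-the-envelope sketch. It assumes the common approximation that training compute is about 6 FLOPs per non-embedding parameter per token (C ≈ 6·N·D) and that `batch_size` counts sequences per step; neither assumption is stated in this card, and the study may account for FLOPs differently.

```python
# Values copied from the list above.
compute_budget = 9e19      # assumed to be total training FLOPs
param_count = 1.68e9       # non-embedding parameters
training_steps = 33678
batch_size = 64            # assumed to be sequences per step

# Assumption: C ~= 6 * N * D (forward + backward pass), not confirmed by the card.
FLOPS_PER_PARAM_PER_TOKEN = 6

# Total training tokens implied by the compute budget and model size.
implied_tokens = compute_budget / (FLOPS_PER_PARAM_PER_TOKEN * param_count)

# Tokens per optimizer step, and per sequence within a step.
tokens_per_step = implied_tokens / training_steps
tokens_per_sequence = tokens_per_step / batch_size

print(f"implied training tokens: {implied_tokens:.3e}")
print(f"implied tokens per step: {tokens_per_step:,.0f}")
print(f"implied tokens per sequence: {tokens_per_sequence:,.0f}")
```

Under these assumptions the budget corresponds to roughly 8.9e9 training tokens. The per-sequence figure is only an estimate of the context length and should be checked against the actual training configuration.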

📈 WandB: https://wandb.ai/potsawee/marin/groups/IsoFlop/workspace

Safetensors: model size 2B params, tensor type F32