Discrete Audio IsoFLOP Model (discrete-audio-isoflop-9e19-588M-d1152-L12-B128-3d35f7)

One of a suite of discrete audio models trained for our IsoFLOP study as part of SODA, which performs unified next-token prediction over interleaved semantic, acoustic, and text tokens.

🥤 Project Page: https://soda-audio.github.io

For full usage instructions (e.g., inference code) and more information, please refer to the SODA-4B-base model card.

The details for this particular model are as follows:

compute_budget: 9e19
param_count (non-embedding): 588M
hidden_dim: 1152
num_layers: 12
batch_size: 128
training_step: 53442
hash_key: 3d35f7
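
Most of these fields also appear in the model name itself. The following is a minimal sketch of how the name could be parsed back into the fields, assuming the pattern `discrete-audio-isoflop-<budget>-<params>-d<hidden_dim>-L<num_layers>-B<batch_size>-<hash_key>` inferred from this one model name; the `IsoFlopConfig` class and `parse_isoflop_name` function are hypothetical helpers, not part of any SODA codebase, and other models in the collection may not follow the same pattern. Note that training_step is not encoded in the name.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IsoFlopConfig:
    """Fields recoverable from the model name (hypothetical helper type)."""
    compute_budget: float   # e.g. 9e19
    param_count: str        # non-embedding parameter count as written, e.g. "588M"
    hidden_dim: int
    num_layers: int
    batch_size: int
    hash_key: str


# Assumed naming pattern, inferred from a single model name.
_NAME_PATTERN = re.compile(
    r"^discrete-audio-isoflop-"
    r"(?P<budget>\d+(?:\.\d+)?e\d+)-"
    r"(?P<params>\d+(?:\.\d+)?[MB])-"
    r"d(?P<hidden>\d+)-"
    r"L(?P<layers>\d+)-"
    r"B(?P<batch>\d+)-"
    r"(?P<hash>[0-9a-f]+)$"
)


def parse_isoflop_name(name: str) -> IsoFlopConfig:
    """Parse a model name (with or without an 'org/' prefix) into its fields."""
    repo = name.rsplit("/", 1)[-1]  # drop an optional organization prefix
    match = _NAME_PATTERN.match(repo)
    if match is None:
        raise ValueError(f"name does not follow the assumed pattern: {name!r}")
    return IsoFlopConfig(
        compute_budget=float(match["budget"]),
        param_count=match["params"],
        hidden_dim=int(match["hidden"]),
        num_layers=int(match["layers"]),
        batch_size=int(match["batch"]),
        hash_key=match["hash"],
    )


config = parse_isoflop_name(
    "soda-research/discrete-audio-isoflop-9e19-588M-d1152-L12-B128-3d35f7"
)
print(config)
```

A helper like this may be convenient when iterating over the models in the collection and grouping them by compute budget, provided the names do in fact share this pattern.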

📈 WandB: https://wandb.ai/potsawee/marin/groups/IsoFlop/workspace

Safetensors
model_size: 0.6B params
tensor_type: F32

Collection including soda-research/discrete-audio-isoflop-9e19-588M-d1152-L12-B128-3d35f7

Discrete Audio IsoFLOP Models (collection, 64 items, updated Feb 10): IsoFLOP models trained on Yodas+Emilia+Nemotron at compute budgets from 3e18 to 3e20.