Trim training-procedure detail and tighten narrative

README.md

```diff
@@ -166,41 +166,11 @@ this restriction; see [License](#license).
 
 ## Training procedure
 
-
-
-| Parameter | Value |
-|---|---|
-| Base model | `mlx-community/whisper-large-v3-mlx` |
-| Dtype | float32 |
-| Encoder | frozen |
-| Decoder | trainable (~67M params, 92.5% of total) |
-| Optimizer | AdamW |
-| Peak learning rate | 5e-5 |
-| LR schedule | linear warmup 500 → cosine decay |
-| Min LR | 1e-6 |
-| Batch size | 10 |
-| Gradient accumulation | 1 |
-| Gradient clipping | global max-norm 1.0 |
-| Validation cadence | every 1,000 steps |
-| Validation batch size | 4 |
-| Steps configured | 30,000 |
-| Steps actually run | 30,000 (no early stop) |
-| Random seed | 42 |
-| MLX allocator cache cap | 20 GB |
+Decoder-only fine-tune, encoder frozen, AdamW with linear warmup and cosine decay, fp32, on a single Apple M3 Ultra with [MLX](https://github.com/ml-explore/mlx). Full hyperparameters, launchers, and reproduction commands are in the [GitHub repository](https://github.com/barathanaslan/phonetic-whisper-mlx).
 
 ### Training-time language token
 
-All training samples use `<|en|>` as the start-of-transcript prefix
-regardless of source-audio language; the token is overloaded as
-"emit IPA". This is intentional — phonetic transcription is meant to
-be language-agnostic, so the decoder is trained without a per-language
-signal. **Pass `language="en"` at inference.**
-
-### Hardware and runtime
-
-Trained on a single Apple Mac Studio M3 Ultra (96 GB unified memory).
-Total wall-clock: 1,629 minutes (~27 hours). Step time ≈ 3.0 s/step
-average at fp32, batch 10, on whisper-large-v3.
+All training samples use `<|en|>` as the start-of-transcript prefix regardless of source-audio language; the token is overloaded as "emit IPA". This is intentional — phonetic transcription is meant to be language-agnostic, so the decoder is trained without a per-language signal. **Pass `language="en"` at inference.**
 
 ## Evaluation
 
```
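The trimmed table compressed the LR schedule into one cell ("linear warmup 500 → cosine decay", peak 5e-5, min 1e-6, 30,000 steps). A minimal sketch of that schedule, assuming warmup rises from zero and the cosine decay spans the remaining steps (the repository's exact implementation may differ):

```python
import math

# Values from the (removed) hyperparameter table
WARMUP_STEPS = 500
TOTAL_STEPS = 30_000
PEAK_LR = 5e-5
MIN_LR = 1e-6

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR over WARMUP_STEPS, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate hits the 5e-5 peak exactly at step 500 and decays along a half-cosine to the 1e-6 floor at step 30,000.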
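The bolded instruction, pass `language="en"` at inference, is the one detail a caller must carry over. A sketch of the decoder prefix this corresponds to, with a hypothetical helper around the upstream `mlx_whisper.transcribe` API; the package usage and the repo placeholder are assumptions, not taken from this repository:

```python
# Whisper conditions its decoder on a start-of-transcript prefix; in this
# fine-tune the <|en|> slot is overloaded to mean "emit IPA" for any language.
SOT_PREFIX = ("<|startoftranscript|>", "<|en|>", "<|transcribe|>")

def transcribe_ipa(audio_path: str, repo: str = "<model-repo>") -> str:
    """Hypothetical wrapper: always force the <|en|> token at inference."""
    import mlx_whisper  # assumed package: pip install mlx-whisper (Apple Silicon)
    result = mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo=repo,  # local path or Hugging Face repo id
        language="en",         # required: other language tokens were never trained
    )
    return result["text"]
```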