for whisper_large: generation_max_length not respected inside evaluation step of training loop
#3
by sitandon - opened
I have set generation_max_length to 225, but the input to the compute_metrics function has shape [batch_size, 448] (for both predictions and labels). I know Whisper's max_length is 448, but why is generation_max_length not respected? As a workaround I have to explicitly truncate the output before passing it to batch_decode; otherwise my memory requirements (RAM) shoot up beyond 100 GB.
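A minimal sketch of the truncation workaround described above, assuming compute_metrics receives token-id arrays padded out to Whisper's 448-token limit. The helper name `truncate_for_metrics` and the constants are illustrative, not part of the transformers API:

```python
import numpy as np

GENERATION_MAX_LENGTH = 225  # the value passed as generation_max_length

def truncate_for_metrics(predictions, labels, max_length=GENERATION_MAX_LENGTH):
    """Truncate token-id arrays along the sequence axis before decoding,
    so batch_decode never sees the full 448-wide padded batch."""
    return predictions[:, :max_length], labels[:, :max_length]

# Example: a batch padded out to Whisper's model max length of 448.
preds = np.zeros((8, 448), dtype=np.int64)
labels = np.zeros((8, 448), dtype=np.int64)
preds_t, labels_t = truncate_for_metrics(preds, labels)
```

Slicing before calling `tokenizer.batch_decode` keeps the decoded string buffers bounded by 225 tokens per example instead of 448.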
sitandon changed discussion title from generation_max_length not respected inside evaluation step of training loop to for whisper_large: generation_max_length not respected inside evaluation step of training loop