Fix ZeroGPU duration: dynamic per-sentence sizing, cap at 120s
Root cause: @spaces.GPU(duration=600) was rejected at decorator registration because 600s exceeds the ZeroGPU per-call cap (120s on PRO). As a result every call, including single-sentence requests, failed with "The requested GPU duration (600s) is larger than the maximum allowed".
Fix: switch duration= to a callable. Empirically, the dominant compute on a warm server is the 30-step Euler denoise (~2.5s per sentence), plus a fixed ~10-12s for Gemma + VAE encode + decode. So:
window = num_sentences * 3 + 12, clamped to [30, 120]
Short prompts pay little more than the fixed overhead, longer prompts scale linearly, and the value always stays within the ZeroGPU per-call ceiling.
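A minimal sketch of the sizing rule as a duration callable, for illustration only. The names gpu_duration, count_sentences, and generate are hypothetical, and the regex-based sentence count is an assumption; the actual code may count sentences differently. The constants come straight from the formula above.

```python
import re

PER_SENTENCE_S = 3      # ~2.5s measured per sentence, budgeted as 3s
FIXED_OVERHEAD_S = 12   # Gemma + VAE encode + decode
MIN_WINDOW_S = 30       # lower clamp
MAX_WINDOW_S = 120      # ZeroGPU per-call cap on PRO


def count_sentences(text: str) -> int:
    """Hypothetical sentence counter: split on runs of '.', '!' or '?'."""
    parts = [p for p in re.split(r"[.!?]+", text) if p.strip()]
    return max(1, len(parts))


def gpu_duration(text: str, *args, **kwargs) -> int:
    """Return the GPU window in seconds, clamped to [30, 120]."""
    window = count_sentences(text) * PER_SENTENCE_S + FIXED_OVERHEAD_S
    return max(MIN_WINDOW_S, min(MAX_WINDOW_S, window))


# Assumed wiring: the callable receives the same arguments as the
# decorated function.
#
# @spaces.GPU(duration=gpu_duration)
# def generate(text, ...):
#     ...
```

Under these constants the floor applies up to 6 sentences (6 * 3 + 12 = 30) and the ceiling is reached at 36 sentences (36 * 3 + 12 = 120); anything longer still requests 120s.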
Known limitation: multi-chunk long-form runs still hold one GPU window for the whole loop, so total wall time must fit under 120s. Per-chunk @spaces.GPU acquisitions are left for a follow-up.