ResembleAI
/

Dramabox

@@ -86,24 +86,6 @@ server.generate_to_file(prompt=..., gen_duration=30.0)
 | `rescale_scale` (`--rescale-scale`) | `"auto"` | Latent-side CFG std-rescale. The default is a cfg-aware schedule (0 below cfg=2, ramping to 1.0 by cfg=10) that keeps the output peak below 0 dBFS at every cfg. Pass any float in [0, 1] to override or 0 to disable. |
 | `watermark` (`--no-watermark` to disable) | `True` | Apply [Resemble Perth](https://github.com/resemble-ai/Perth) imperceptible neural watermark to the output. Survives MP3/AAC, common edits; ≈ 100 % detection accuracy. |
-### Auto rescale (CFG safety)
-CFG amplifies the latent (`pred = cond + (cfg-1)·(cond - uncond)`). With no compensation, outputs hard-clip at `cfg ≥ 3`. Dramabox automatically applies a CFG-aware std-rescale schedule:
-| cfg | auto rescale | output peak |
-|---|---|---|
-| ≤ 2 | 0.0 (disabled) | safely below 0 dBFS |
-| 3 | 0.6 | ~−1.8 dBFS |
-| 4–8 | 0.8 | ~−1 to −3 dBFS |
-| 9 | 0.9 | ~−2.7 dBFS |
-| 10 | 1.0 | ~−4.4 dBFS |
-No clipping at any CFG, no manual tuning needed. Pass `rescale_scale=<float>` to override.
-### End-of-clip silence patch (long-form safety)
-The base LTX-2.3 DiT was trained on audio ≤ ~20 s and learned a strong end-of-clip silence prior at the next patchifier-aligned latent boundary (frame 513 ≈ 20.4 s). Dramabox automatically interpolates frames 512–513 from their neighbours before VAE decode whenever the output crosses 20.5 s — eliminating the ~30 ms silence dip that would otherwise show up in long generations. No flag, no override needed.
 ## Prompt format
 ```

 | `rescale_scale` (`--rescale-scale`) | `"auto"` | Latent-side CFG std-rescale. The default is a cfg-aware schedule (0 below cfg=2, ramping to 1.0 by cfg=10) that keeps the output peak below 0 dBFS at every cfg. Pass any float in [0, 1] to override or 0 to disable. |
 | `watermark` (`--no-watermark` to disable) | `True` | Apply [Resemble Perth](https://github.com/resemble-ai/Perth) imperceptible neural watermark to the output. Survives MP3/AAC, common edits; ≈ 100 % detection accuracy. |
 ## Prompt format
 ```