error: Offset increment outside graph capture :(

#1
by gabbo1995 - opened

When I try it, an error appears in red and says:
Generation failed: Offset increment outside graph capture encountered unexpectedly.

It could be due to the random seed in the generator, but I'm not sure.

Now it says:

Generation failed: CUDA error: device-side assert triggered. Search for `cudaErrorAssert` in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

:(

HuggingFaceM4 org

live debugging 😅

Sorry for the insistence, I just got carried away by curiosity. <3 @andito

HuggingFaceM4 org

No worries. It took me some time, but I figured out the issue: lots of people were trying to use the Space at the same time. I had set up a queue for inference, but because there were five possible models and I couldn't load them all into the GPU together, I had to hot-swap them, and that hot-swapping caused issues. I decided to solve it by disabling four of the models in this Space and keeping only the voice cloning demo with the large model, which is, to me, the coolest. I'm also providing 3 default voices, which are amazing.
If you want to clone the Space, you can unset the env variable for model selection to get all five models back and experiment. But with the traffic I'm seeing here, it's hard to solve it differently. Spaces weren't really designed to host many models and hot-swap them. And with ZeroGPU I would need to build the graph before each inference, so no one would get the low-latency inference, which is the whole point here.
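The hot-swapping described above can be sketched roughly like this. The single-slot design and the loader callables are assumptions for illustration, not the Space's actual code:

```python
import threading

class ModelSlot:
    """Keep at most one model resident at a time and swap it under a lock.

    A rough sketch of the hot-swapping idea: with five models that don't fit
    on the GPU together, only the most recently requested one stays loaded.
    """

    def __init__(self, loaders):
        self._loaders = loaders        # model name -> zero-arg loader callable
        self._lock = threading.Lock()  # serialize swaps across requests
        self._name = None
        self._model = None

    def get(self, name):
        with self._lock:
            if name != self._name:
                self._model = None     # drop the old model first to free memory
                self._model = self._loaders[name]()
                self._name = name
            return self._model
```

The lock matters because two concurrent requests for different models would otherwise race on the single GPU slot, which is one way a swap can leave things in a broken state.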

HuggingFaceM4 org

Still, I'm leaving the thread open in case there are other issues. I also added a queue for requests; maybe I'll manage to get an OOM if enough people queue up voices to clone xD

Generation failed: CUDA error: device-side assert triggered. Search for `cudaErrorAssert` in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Still happening for me!

HuggingFaceM4 org

Yes, this time it was because someone submitted a super long text. The pain is that once it breaks, I need to restart the Space to recover. I added a guard against long texts.
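A guard like that can be as simple as a length check before generation reaches the model. The 500-character limit below is an arbitrary placeholder, not the Space's actual cutoff:

```python
MAX_CHARS = 500  # placeholder limit; the real cutoff depends on what the model tolerates

def guard_text(text: str) -> str:
    """Reject inputs long enough to crash generation instead of letting them through."""
    text = text.strip()
    if not text:
        raise ValueError("Empty input.")
    if len(text) > MAX_CHARS:
        raise ValueError(f"Input is {len(text)} characters; the limit is {MAX_CHARS}.")
    return text
```

Raising before inference is what keeps a single bad request from triggering the device-side assert that forces a Space restart.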

andito changed discussion status to closed

@andito hey, what causes this offset issue, if I may ask? I've been trying to implement parallel generation with another TTS that uses CUDA graphs, and it shows the same error, so how do we solve it? (Even opening multiple instances doesn't work. GPU: GH200)
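For context on the question above: in PyTorch, that assert tends to fire when a CUDA RNG generator that was registered to a captured graph has its offset advanced while no capture or replay is in flight, which is easy to hit when concurrent requests share one captured graph. This reading is an inference from the error text, not something confirmed in this thread. One common mitigation is serializing replays behind a lock; here is a CPU-only sketch of that idea, with a dummy `replay_fn` standing in for `CUDAGraph.replay()`:

```python
import threading

class SerializedGraphRunner:
    """Serialize access to a captured-graph pipeline so shared RNG state
    is never touched by two requests at once.

    Sketch only: replay_fn stands in for CUDAGraph.replay(); real code would
    also copy each request's inputs into the graph's static tensors first.
    """

    def __init__(self, replay_fn):
        self._replay_fn = replay_fn
        self._lock = threading.Lock()

    def generate(self, request):
        with self._lock:  # one request inside the graph at a time
            return self._replay_fn(request)
```

For true parallelism you would instead capture one graph (with its own generator and static buffers) per worker, so no RNG state is shared across concurrent requests.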
