# Hugging Face Spaces Deployment Guide

## Key Changes for ZeroGPU Support
### 1. Import Order (CRITICAL!)

The `spaces` package must be imported before any CUDA-related packages:

```python
# CORRECT ✅
import spaces
import torch
import cv2

# WRONG ❌
import torch
import spaces  # Too late! CUDA may already be initialized
```
### 2. GPU Decorator

Use the `@spaces.GPU` decorator on functions that need a GPU:

```python
@spaces.GPU
def sam_refine(video_state, point_prompt, click_state, evt):
    initialize_models()  # Lazy load on first use
    # ... GPU code here

@spaces.GPU(duration=120)  # Specify duration (seconds) for long tasks
def run_videomama_with_sam2(video_state, click_state):
    # ... GPU code here
```
### 3. Lazy Model Loading

Models should be initialized on first use, not at app startup:

```python
# Global model variables
sam2_tracker = None
videomama_pipeline = None

def initialize_models():
    global sam2_tracker, videomama_pipeline
    if sam2_tracker is not None:
        return  # Already loaded
    # Load models here...

# In GPU functions:
@spaces.GPU
def inference_function():
    initialize_models()  # Load on first use
    # Use models...
```
### 4. Requirements

Add `spaces` to `requirements.txt`:

```text
# CRITICAL: Hugging Face ZeroGPU support
spaces

# Other packages...
torch>=2.0.0
gradio==4.31.0
```
### 5. README Configuration

Update README.md with the hardware specification:

```yaml
---
title: VideoMaMa
sdk: gradio
sdk_version: 4.31.0
app_file: app.py
python_version: "3.10"
hardware: zero-a10g  # For ZeroGPU Pro
---
```
## Available Hardware Options

For Pro subscribers:

- `zero-a10g` - NVIDIA A10G (24GB VRAM) - Recommended
- `zero-a100` - NVIDIA A100 (40GB VRAM) - For larger models
## Duration Parameter

The `duration` parameter specifies the GPU allocation time in seconds:

```python
@spaces.GPU(duration=60)   # 1 minute - good for a single image
@spaces.GPU(duration=120)  # 2 minutes - good for short videos
@spaces.GPU(duration=300)  # 5 minutes - for long processing
```
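For video workloads, a reasonable duration scales with the number of frames. The helper below is purely illustrative (the name `estimate_duration` and the constants 10 s overhead / 2 s per frame are assumptions, not measured values); the computed value would then be passed to `@spaces.GPU(duration=...)`:

```python
# Hypothetical helper for sizing the duration parameter; the
# constants (10 s overhead, 2 s per frame, 300 s cap) are
# illustrative, not measured for any particular model.
def estimate_duration(n_frames, per_frame_s=2, overhead_s=10, cap_s=300):
    """Return a GPU duration (seconds) scaled to the workload."""
    return min(overhead_s + per_frame_s * n_frames, cap_s)

# The estimate is used at decoration time, e.g.:
# @spaces.GPU(duration=estimate_duration(expected_frames))
```

Capping the estimate keeps a mis-sized request from burning through a Pro quota on a single call.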
## Testing Locally

To test without ZeroGPU, fall back to a no-op decorator when `spaces` is unavailable:

```python
# Mock the spaces decorator for local testing
try:
    import spaces
except ImportError:
    # Mock for local development: supports both @spaces.GPU
    # and @spaces.GPU(duration=X)
    class MockSpaces:
        @staticmethod
        def GPU(func=None, duration=None):
            if callable(func):
                return func  # Used as bare @spaces.GPU
            def decorator(f):
                return f     # Used as @spaces.GPU(duration=X)
            return decorator
    spaces = MockSpaces()
```
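With `spaces` absent locally, the mock behaves as a transparent pass-through for both call forms. A quick sanity check (the functions `process` and `process_long` are placeholders, not part of the app):

```python
# Demonstrates both call forms of the no-op mock. MockSpaces is the
# local-development fallback; process/process_long are placeholders.
class MockSpaces:
    @staticmethod
    def GPU(func=None, duration=None):
        if callable(func):
            return func  # Bare @spaces.GPU
        def decorator(f):
            return f     # @spaces.GPU(duration=X)
        return decorator

spaces = MockSpaces()

@spaces.GPU
def process(x):
    return x + 1

@spaces.GPU(duration=120)
def process_long(x):
    return x * 10
```

Both decorated functions behave exactly like the undecorated originals, so the app logic can be tested on a CPU-only machine.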
## Common Errors

### Error: "CUDA has been initialized before importing spaces"

Solution: Move `import spaces` to the very top of app.py, before any other imports.

### Error: "GPU time exceeded"

Solution: Increase the `duration` parameter in `@spaces.GPU(duration=X)`.

### Error: "Out of memory"

Solution:

- Use smaller batch sizes
- Clear the CUDA cache with `torch.cuda.empty_cache()`
- Consider requesting `zero-a100` hardware
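The cache-clearing step can be wrapped so it also runs when inference fails. This is a sketch, not part of the `spaces` API: `free_cuda_cache` and `inference_with_cleanup` are illustrative helper names, and torch is imported defensively so the code also runs on CPU-only machines:

```python
# Illustrative helpers (not part of the spaces API). torch is
# imported defensively so the snippet also runs without a GPU.
try:
    import torch
except ImportError:
    torch = None

def free_cuda_cache():
    """Release cached CUDA memory if torch and a GPU are present."""
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()

def inference_with_cleanup(fn, *args):
    # try/finally guarantees the cache is cleared even if fn raises
    try:
        return fn(*args)
    finally:
        free_cuda_cache()
```

Note that `empty_cache()` only returns memory from PyTorch's caching allocator to the driver; it does not free tensors that are still referenced, so dropping references (or using smaller batches) remains the primary fix.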
## Deployment Checklist

- [ ] `import spaces` is the FIRST import in app.py
- [ ] All GPU functions have the `@spaces.GPU` decorator
- [ ] Models use lazy loading (initialized on first use)
- [ ] `spaces` is in requirements.txt
- [ ] README.md specifies `hardware: zero-a10g`
- [ ] Tested locally without errors
- [ ] Git pushed to Hugging Face Space repository
## Files Modified

### app.py

- Added `import spaces` at the top
- Added `@spaces.GPU` decorators to GPU functions
- Implemented lazy model loading
- Removed model initialization from main block

### requirements.txt

- Added `spaces` package

### README.md

- Added hardware configuration
- Set correct Gradio version
## Support

For issues with ZeroGPU:

- Documentation: https://huggingface.co/docs/hub/spaces-zerogpu
- Forum: https://discuss.huggingface.co/