Hugging Face Spaces Deployment Guide

Key Changes for ZeroGPU Support

1. Import Order (CRITICAL!)

The spaces package must be imported before any CUDA-related packages:

# CORRECT ✅
import spaces
import torch
import cv2

# WRONG ❌
import torch
import spaces  # Too late!

2. GPU Decorator

Use the @spaces.GPU decorator on any function that needs the GPU:

@spaces.GPU
def sam_refine(video_state, point_prompt, click_state, evt):
    initialize_models()  # Lazy load on first use
    # ... GPU code here

@spaces.GPU(duration=120)  # Specify duration for long tasks
def run_videomama_with_sam2(video_state, click_state):
    # ... GPU code here

3. Lazy Model Loading

Models should be initialized on first use, not at app startup:

# Global model variables
sam2_tracker = None
videomama_pipeline = None

def initialize_models():
    global sam2_tracker, videomama_pipeline
    if sam2_tracker is not None:
        return  # Already loaded
    # Load models here...

# In GPU functions:
@spaces.GPU
def inference_function():
    initialize_models()  # Load on first use
    # Use models...
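
The pattern above can be checked with a self-contained sketch. The loader here is a dummy stand-in for the real checkpoint load (the counter exists only to demonstrate the behavior); repeated calls pay the loading cost exactly once:

```python
load_calls = {"count": 0}

def load_heavy_model():
    # Dummy stand-in for an expensive checkpoint load (hypothetical helper).
    load_calls["count"] += 1
    return object()

sam2_tracker = None  # global slot, empty until the first GPU call

def initialize_models():
    global sam2_tracker
    if sam2_tracker is not None:
        return  # already loaded; subsequent calls are free
    sam2_tracker = load_heavy_model()

initialize_models()
initialize_models()  # no-op: the model is cached in the global
print(load_calls["count"])  # prints 1
```

This keeps app startup fast (important on Spaces, where the container boots without a GPU attached) and defers all CUDA work until inside a @spaces.GPU function.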

4. Requirements

Add spaces to requirements.txt:

# CRITICAL: Hugging Face ZeroGPU support
spaces

# Other packages...
torch>=2.0.0
gradio==4.31.0

5. README Configuration

Update README.md with hardware specification:

---
title: VideoMaMa
sdk: gradio
sdk_version: 4.31.0
app_file: app.py
python_version: "3.10"
hardware: zero-a10g  # For ZeroGPU Pro
---

Available Hardware Options

For Pro subscribers:

  • zero-a10g - NVIDIA A10G (24GB VRAM) - Recommended
  • zero-a100 - NVIDIA A100 (40GB VRAM) - For larger models

Duration Parameter

The duration parameter sets the maximum GPU allocation time for the call, in seconds:

@spaces.GPU(duration=60)   # 1 minute - good for single image
@spaces.GPU(duration=120)  # 2 minutes - good for short videos
@spaces.GPU(duration=300)  # 5 minutes - for long processing

Testing Locally

To test without ZeroGPU:

# Mock the spaces decorator for local testing
try:
    import spaces
except ImportError:
    # Mock for local development
    class MockSpaces:
        @staticmethod
        def GPU(func=None, duration=None):
            # Support both forms: bare @spaces.GPU passes the function
            # as the first argument; @spaces.GPU(duration=...) does not.
            if callable(func):
                return func
            return lambda f: f
    spaces = MockSpaces()
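
A quick self-contained sanity check of such a fallback (the mock is reproduced here so the snippet runs on its own): with both decorator forms, the decorated function should behave exactly as if it were undecorated.

```python
class MockSpaces:
    @staticmethod
    def GPU(func=None, duration=None):
        # Bare @spaces.GPU passes the function positionally;
        # @spaces.GPU(duration=...) must return a decorator.
        if callable(func):
            return func
        return lambda f: f

spaces = MockSpaces()

@spaces.GPU
def bare(x):
    return x + 1

@spaces.GPU(duration=60)
def timed(x):
    return x * 2

print(bare(1), timed(3))  # prints: 2 6
```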

Common Errors

Error: "CUDA has been initialized before importing spaces"

Solution: Move import spaces to the very top of app.py, before any other imports.

Error: "GPU time exceeded"

Solution: Increase the duration parameter in @spaces.GPU(duration=X).

Error: "Out of memory"

Solution:

  • Use smaller batch sizes
  • Clear CUDA cache: torch.cuda.empty_cache()
  • Consider requesting zero-a100 hardware
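
The cache-clearing step can be wrapped in a small best-effort helper (the function name is ours, not part of the app). It degrades gracefully when torch or CUDA is unavailable, which also makes it safe to call during local testing:

```python
import gc

def free_gpu_memory():
    # Drop Python-level references first, then ask the CUDA
    # caching allocator to release unused cached blocks.
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            return "cuda cache cleared"
    except ImportError:
        pass
    return "no cuda available"

print(free_gpu_memory())
```

Calling it between inference runs helps avoid fragmentation-driven OOMs on long sessions; note that empty_cache() frees cached, not actively used, memory, so smaller batches remain the primary fix.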

Deployment Checklist

  • import spaces is the FIRST import in app.py
  • All GPU functions have @spaces.GPU decorator
  • Models use lazy loading (initialized on first use)
  • spaces is in requirements.txt
  • README.md specifies hardware: zero-a10g
  • Tested locally without errors
  • Git pushed to Hugging Face Space repository

Files Modified

  1. app.py

    • Added import spaces at the top
    • Added @spaces.GPU decorators to GPU functions
    • Implemented lazy model loading
    • Removed model initialization from main block
  2. requirements.txt

    • Added spaces package
  3. README.md

    • Added hardware configuration
    • Set correct Gradio version

Support

For issues with ZeroGPU: