Quick Start: Agent Architecture
TL;DR
Problem: Python 3.13 doesn't have wheels for AudioCraft dependencies
Solution: Run ML services as separate agents with Python 3.11
Architecture
```
Main API (Python 3.13, Port 8001)
        │ HTTP calls
        ▼
Music Agent      (Python 3.11, Port 8002) → Handles MusicGen
Vocal Agent      (Python 3.11, Port 8003) → Handles Bark
Processing Agent (Python 3.11, Port 8004) → Handles Demucs
```
Setup Music Agent (5 minutes)
Step 1: Create Python 3.11 Environment
```powershell
cd agents\music
py -3.11 -m venv venv
venv\Scripts\activate
```
Step 2: Install Dependencies
```powershell
# Install PyTorch first (CPU version)
pip install torch==2.1.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cpu

# Install other dependencies
pip install -r requirements.txt
```
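For reference, a plausible `agents/music/requirements.txt` might look like the following. This is illustrative only: the exact package set and pins depend on your AudioCraft version (PyTorch is installed separately above, so it is left out here).

```text
# agents/music/requirements.txt (illustrative - adjust to your audiocraft version)
fastapi
uvicorn[standard]
audiocraft
```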
Step 3: Run the Agent
```powershell
python main.py
```
Agent runs on http://localhost:8002
Step 4: Test the Agent
```powershell
# Health check
curl http://localhost:8002/health

# Generate music
curl -X POST http://localhost:8002/generate `
  -H "Content-Type: application/json" `
  -d '{"prompt": "Epic orchestral soundtrack", "duration": 10}'
```
Update Main API to Use Agent
Option A: Direct HTTP Calls
```python
# backend/app/services/music_generation.py
import httpx


class MusicGenerationService:
    def __init__(self):
        self.agent_url = "http://localhost:8002"

    async def generate(self, prompt: str, duration: int):
        # Generation can take minutes on CPU, so use a generous timeout.
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.agent_url}/generate",
                json={"prompt": prompt, "duration": duration},
                timeout=300.0,
            )
            response.raise_for_status()
            return response.json()
```
Option B: Celery Tasks (Recommended for Production)
```python
# backend/app/tasks/music_tasks.py
from celery import Celery
import httpx

celery_app = Celery('audioforge', broker='redis://localhost:6379/0')


# Celery tasks run in a worker process and must be synchronous functions,
# so use the blocking httpx API here rather than an async client.
@celery_app.task
def generate_music_task(generation_id: str, prompt: str, duration: int):
    response = httpx.post(
        "http://music-agent:8002/generate",
        json={
            "prompt": prompt,
            "duration": duration,
            "callback_url": f"http://api:8001/callbacks/generation/{generation_id}",
        },
        timeout=300.0,
    )
    response.raise_for_status()
    return response.json()
```
Docker Compose (Production)
```yaml
version: '3.8'

services:
  # Main API - Python 3.13
  api:
    build: ./backend
    ports: ["8001:8001"]
    environment:
      - MUSIC_AGENT_URL=http://music-agent:8002
    depends_on:
      - postgres
      - redis
      - music-agent

  # Music Agent - Python 3.11
  music-agent:
    build: ./agents/music
    ports: ["8002:8002"]
    volumes:
      - audio_storage:/app/storage
    environment:
      - MUSICGEN_DEVICE=cpu

  postgres:
    image: postgres:16-alpine

  redis:
    image: redis:7-alpine

volumes:
  audio_storage:
```
Start everything:
```shell
docker-compose up -d
```
Benefits
- ✅ No Python version conflicts - Each service uses the right Python version
- ✅ Independent scaling - Scale music generation separately from the API
- ✅ Fault isolation - If the music agent crashes, the API stays up
- ✅ Easy updates - Update ML models without touching the API
- ✅ Resource control - Allocate GPU to specific agents
- ✅ Development speed - Teams work on different agents independently
Migration Path
Phase 1: Run Agent Alongside (This Week)
- Keep existing backend code
- Start music agent on port 8002
- Route new requests to agent
- Old requests still use monolithic service
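Routing only new requests to the agent can be as simple as a per-request feature flag. A sketch of that Phase 1 decision (the `USE_MUSIC_AGENT` env var and the return values are assumptions, not names from the codebase):

```python
# Illustrative Phase 1 routing: new requests go to the agent,
# everything else falls back to the existing monolithic path.
import os


def pick_backend(is_new_request: bool) -> str:
    """Return which implementation should serve this generation request."""
    agent_enabled = os.getenv("USE_MUSIC_AGENT", "false").lower() == "true"
    if agent_enabled and is_new_request:
        return "agent"      # HTTP call to the music agent on port 8002
    return "monolith"       # legacy in-process code path
```

Keeping the switch in one function makes the Phase 2 rollback a single env-var change.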
Phase 2: Switch Traffic (Next Week)
- Update orchestrator to call agent
- Monitor performance
- Rollback if issues
Phase 3: Remove Old Code (Week 3)
- Delete monolithic ML code
- Keep only orchestrator
- Full agent architecture
Performance Comparison
Monolithic (Current)
- Startup: 30-60 seconds (load all models)
- Memory: 4-8 GB (all models loaded)
- Scaling: Vertical only (bigger server)
Agent Architecture
- Startup: 5 seconds (API), 30 seconds (agents)
- Memory: 1 GB (API), 2-4 GB per agent
- Scaling: Horizontal (more agent instances)
Cost Analysis
Development
- Initial: +2 weeks (build agents)
- Ongoing: -50% (easier maintenance)
Infrastructure
- Development: Same (run locally)
- Production: -30% (scale only what's needed)
Monitoring
Each agent exposes metrics:
```
# GET /metrics
{
  "requests_total": 1234,
  "requests_failed": 12,
  "avg_generation_time": 45.2,
  "model_loaded": true,
  "memory_usage_mb": 2048
}
```
Aggregate in Grafana dashboard.
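Inside an agent, most of these numbers can come from a tiny in-process counter updated once per request. A sketch covering a subset of the fields above (class and method names are illustrative):

```python
# Illustrative in-process metrics behind an agent's /metrics endpoint.
class AgentMetrics:
    def __init__(self):
        self.requests_total = 0
        self.requests_failed = 0
        self._total_seconds = 0.0  # wall-clock time of successful generations

    def record(self, seconds: float, failed: bool = False):
        # Call once per generation request with its duration.
        self.requests_total += 1
        if failed:
            self.requests_failed += 1
        else:
            self._total_seconds += seconds

    def snapshot(self, model_loaded: bool) -> dict:
        ok = self.requests_total - self.requests_failed
        return {
            "requests_total": self.requests_total,
            "requests_failed": self.requests_failed,
            "avg_generation_time": round(self._total_seconds / ok, 1) if ok else 0.0,
            "model_loaded": model_loaded,
        }
```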
Troubleshooting
Agent won't start
```powershell
# Check Python version
python --version   # Should be 3.11.x

# Check dependencies
pip list | findstr torch
```
Can't connect to agent
```powershell
# Check if running
curl http://localhost:8002/health

# Check firewall
netstat -ano | findstr :8002
```
Generation fails
- Check the agent logs for model loading errors
- Verify the storage directory exists
Next Steps
- ✅ Read AGENT_ARCHITECTURE.md for the full design
- ⏳ Set up Music Agent (follow steps above)
- ⏳ Test generation end-to-end
- ⏳ Update main API orchestrator
- ⏳ Deploy to staging
- ⏳ Create Vocal and Processing agents
Questions?
This architecture is industry-standard for ML services:
- OpenAI uses it (separate models as services)
- Hugging Face Inference API uses it
- Stable Diffusion deployments use it
You're in good company!