Next Steps: Get Music Generation Working
TL;DR
Run these commands to get the music agent running in about 30 minutes (the first generation then downloads the model, which adds roughly 10 minutes):
cd agents\music
py -3.11 -m venv venv
.\venv\Scripts\activate
pip install torch==2.1.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cpu
pip install fastapi uvicorn[standard] pydantic httpx python-dotenv
pip install transformers librosa soundfile "numpy<2.0.0"
pip install git+https://github.com/facebookresearch/audiocraft.git
python main.py
Then test:
curl http://localhost:8002/health
Detailed Steps
Step 1: Navigate to Music Agent (1 minute)
cd C:\Users\Keith\AudioForge\agents\music
Step 2: Create Python 3.11 Environment (2 minutes)
# Create virtual environment with Python 3.11
py -3.11 -m venv venv
# Activate it
.\venv\Scripts\activate
# Verify Python version
python --version
# Should show Python 3.11.x (for example, Python 3.11.9)
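The same check can be scripted so it fails loudly when the wrong interpreter is active. This is a minimal sketch, assuming the agent only needs the major.minor version to be 3.11; the names `REQUIRED` and `is_supported` are illustrative and not part of the project.

```python
import sys

# Assumption: any 3.11.x interpreter is acceptable, as in the step above.
REQUIRED = (3, 11)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version matches REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    current = sys.version.split()[0]
    if is_supported():
        print(f"Python {current} is fine for the music agent")
    else:
        print(f"Python {current} is active; re-create the venv with py -3.11")
```

Run it with the virtual environment activated, so that `python` resolves to the interpreter inside `venv`.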
Step 3: Install PyTorch (5-10 minutes)
# Install PyTorch 2.1.0 CPU version
pip install torch==2.1.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cpu
This downloads ~200MB. Wait for completion.
Step 4: Install Web Framework (1 minute)
pip install fastapi uvicorn[standard] pydantic httpx python-dotenv
Step 5: Install Audio Libraries (2 minutes)
pip install transformers librosa soundfile "numpy<2.0.0"
Step 6: Install AudioCraft (5-10 minutes)
# This clones and installs from GitHub
pip install git+https://github.com/facebookresearch/audiocraft.git
Note: This may show warnings about version conflicts. These are usually harmless; confirm with the health check in Step 9 that "audiocraft_available" is true.
Step 7: Create Storage Directory (10 seconds)
New-Item -ItemType Directory -Force -Path storage\audio\music
Step 8: Start the Agent (5 seconds)
python main.py
You should see:
INFO: Started server process [12345]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8002
Step 9: Test the Agent (1 minute)
Open a NEW PowerShell window (keep the agent running):
# Health check
curl http://localhost:8002/health
# Should return:
# {
# "status": "healthy",
# "python_version": "3.11.9",
# "torch_available": true,
# "audiocraft_available": true,
# "device": "cpu"
# }
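To check the health response from a script rather than by eye, something like the following could work. It is a sketch that assumes the response has exactly the fields shown in the sample above; `fetch_health` and `health_problems` are hypothetical helper names, and the standard library is used so the check runs even outside the virtual environment.

```python
import json
import urllib.request

# Flags taken from the sample health response above.
REQUIRED_FLAGS = ("torch_available", "audiocraft_available")


def fetch_health(base_url: str = "http://localhost:8002", timeout: float = 5.0) -> dict:
    """GET /health from the running agent and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.load(resp)


def health_problems(health: dict) -> list[str]:
    """Return human-readable problems; an empty list means the agent looks ready."""
    problems = []
    if health.get("status") != "healthy":
        problems.append(f"status is {health.get('status')!r}, expected 'healthy'")
    for flag in REQUIRED_FLAGS:
        if health.get(flag) is not True:
            problems.append(f"{flag} is not true")
    return problems
```

With the agent running, `health_problems(fetch_health())` should return an empty list.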
Step 10: Generate Music! (1-2 minutes)
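# Generate 10 seconds of music
# In Windows PowerShell, curl is an alias for Invoke-WebRequest and rejects -X/-H/-d, so use Invoke-RestMethod
$body = '{"prompt": "Epic orchestral soundtrack", "duration": 10}'
Invoke-RestMethod -Method Post -Uri http://localhost:8002/generate `
  -ContentType "application/json" -Body $body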
First time: Downloads model (~1.5GB) - takes 5-10 minutes
After that: Generates in 30-60 seconds
Response:
{
  "task_id": "music_abc123",
  "status": "completed",
  "audio_path": "./storage/audio/music/music_abc123.wav",
  "metadata": {
    "duration": 10,
    "sample_rate": 32000,
    "model": "facebook/musicgen-small"
  }
}
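The request and response above can also be driven from Python. The sketch below assumes the endpoint accepts exactly the `prompt` and `duration` fields shown and returns the fields in the sample response; the helper names and the 900-second timeout (chosen to cover the first-run model download) are assumptions, not part of the agent.

```python
import json
import urllib.request


def build_generate_request(prompt: str, duration: int,
                           base_url: str = "http://localhost:8002") -> urllib.request.Request:
    """Build the POST /generate request with a JSON body."""
    body = json.dumps({"prompt": prompt, "duration": duration}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, duration: int, timeout: float = 900.0) -> dict:
    """Send the request to the running agent and return the parsed response."""
    request = build_generate_request(prompt, duration)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def audio_path_from(response: dict) -> str:
    """Return the audio path, or raise if the task did not complete."""
    if response.get("status") != "completed":
        raise RuntimeError(f"generation did not complete: {response.get('status')!r}")
    return response["audio_path"]
```

For example, `audio_path_from(generate("Epic orchestral soundtrack", 10))` returns the path of the generated WAV file once the agent has finished.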
Step 11: Listen to Your Music! 🎵
# Open the generated file (replace music_abc123 with the task_id from your response)
start .\storage\audio\music\music_abc123.wav
Troubleshooting
Error: "py -3.11 not found"
Python 3.11 is not installed. Install it from: https://www.python.org/downloads/release/python-3119/
Error: "torch not found" when running
You forgot to activate the virtual environment:
.\venv\Scripts\activate
Error: "audiocraft not found"
Installation might have failed. Try:
pip install --no-cache-dir git+https://github.com/facebookresearch/audiocraft.git
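Before reinstalling, it can help to see which packages the active interpreter can actually find. This is a small sketch using the standard library's `importlib.util.find_spec`; the package list mirrors the install steps above, and `missing_packages` is an illustrative name.

```python
from importlib.util import find_spec

# Top-level import names of the packages installed in the steps above.
AGENT_PACKAGES = ("torch", "torchaudio", "audiocraft", "fastapi", "uvicorn")


def missing_packages(names=AGENT_PACKAGES) -> list[str]:
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a package without importing it, so a package that is present but broken still counts as found. If everything is reported missing, the virtual environment is probably not activated.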
Error: "CUDA out of memory"
In CPU mode this shouldn't happen. If it does, force the CPU device:
# Set environment variable
$env:MUSICGEN_DEVICE="cpu"
python main.py
Agent starts but health check fails
Check if port 8002 is already in use:
netstat -ano | findstr :8002
If it is, kill that process or change the port in main.py.
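The port check can also be done from Python, which is handy inside a startup script. This is a minimal sketch that treats a successful TCP connection as "in use"; `port_in_use` is a hypothetical helper, and it only checks the local loopback address by default.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    state = "in use" if port_in_use(8002) else "free"
    print(f"Port 8002 is {state}")
```

If the port is reported as in use while the agent is not running, another process holds it and the `netstat` output above shows its process ID.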
What's Next?
Option A: Integrate with Main API
Update backend/app/services/orchestrator.py:
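import httpx


class Orchestrator:
    def __init__(self):
        # Base URL of the music agent started in Step 8
        self.music_agent_url = "http://localhost:8002"

    async def generate_music(self, prompt: str, duration: int):
        # CPU generation is slow, so allow up to 5 minutes per request
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.music_agent_url}/generate",
                json={"prompt": prompt, "duration": duration},
                timeout=300.0,
            )
            # Surface HTTP errors instead of parsing an error body as a result
            response.raise_for_status()
            return response.json()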
Option B: Test from Frontend
The frontend already has the generation form. Just make sure:
- Backend is running (port 8001)
- Music Agent is running (port 8002)
- Backend calls agent
Option C: Build More Agents
Repeat this process for:
- Vocal Agent (port 8003) - Bark for vocals
- Processing Agent (port 8004) - Demucs for stems
Performance Tips
Speed Up Generation
Use smaller model:
{"model": "facebook/musicgen-small"} // Faster
{"model": "facebook/musicgen-medium"} // Better quality
{"model": "facebook/musicgen-large"} // Best quality, slowest
Shorter duration:
{"duration": 10} // 30 seconds generation
{"duration": 30} // 90 seconds generation
Use GPU (if available):
# Install CUDA version of PyTorch
pip install torch==2.1.0+cu118 torchaudio==2.1.0+cu118 --index-url https://download.pytorch.org/whl/cu118
Reduce Memory Usage
- Use smaller model (see above)
- Generate shorter clips
- Close other applications
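The two duration figures under Speed Up Generation (10 seconds of audio in about 30 seconds, 30 seconds in about 90) suggest roughly 3 seconds of CPU time per second of audio. The sketch below turns that into a rough estimate for picking client timeouts. The linear scaling and the safety factor are assumptions extrapolated from those two data points, not measurements, and they exclude the first-run model download.

```python
# Assumption: CPU generation time scales linearly with clip length,
# at about 3 seconds of compute per second of audio.
CPU_SECONDS_PER_AUDIO_SECOND = 3.0


def estimate_generation_seconds(duration: float) -> float:
    """Rough CPU generation time for a clip of `duration` seconds."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return duration * CPU_SECONDS_PER_AUDIO_SECOND


def suggested_timeout(duration: float, safety_factor: float = 2.0) -> float:
    """Client timeout with headroom over the estimate (safety_factor is a guess)."""
    return estimate_generation_seconds(duration) * safety_factor
```

Under these assumptions a 30-second clip gets a suggested timeout of 180 seconds, which fits inside the `timeout=300.0` used in the orchestrator example.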
Production Deployment
Docker (Recommended)
# Build image
docker build -t audioforge-music-agent ./agents/music
# Run container
docker run -p 8002:8002 -v ${PWD}/storage:/app/storage audioforge-music-agent
Docker Compose (Best)
# Start all services
docker-compose up -d
# View logs
docker-compose logs -f music-agent
# Stop services
docker-compose down
Success Criteria
You'll know it's working when:
- ✅ Health check returns "status": "healthy"
- ✅ Generate request returns "status": "completed"
- ✅ Audio file exists in storage/audio/music/
- ✅ Audio file plays and sounds like music
- ✅ Subsequent generations are faster (model cached)
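The file-related criteria can be checked without opening a player. This sketch uses the standard library's `wave` module and assumes the agent writes uncompressed PCM WAV files at the 32000 Hz sample rate shown in the sample response; `wave` cannot read floating-point WAV files, so treat a read error as inconclusive rather than as a failure. The function names are illustrative.

```python
import wave
from pathlib import Path


def describe_wav(path) -> dict:
    """Read sample rate, channel count, and duration from a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "duration": wav.getnframes() / rate,
        }


def looks_like_expected(path, expected_duration: float = 10,
                        expected_rate: int = 32000, tolerance: float = 0.5) -> bool:
    """True if the file exists and matches the requested duration and sample rate."""
    if not Path(path).is_file():
        return False
    info = describe_wav(path)
    return (info["sample_rate"] == expected_rate
            and abs(info["duration"] - expected_duration) <= tolerance)
```

Pass the `audio_path` from the generate response, for example `looks_like_expected("./storage/audio/music/music_abc123.wav")` with the task ID replaced by your own.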
Timeline
| Task | Time | Cumulative |
|---|---|---|
| Setup environment | 2 min | 2 min |
| Install PyTorch | 10 min | 12 min |
| Install dependencies | 5 min | 17 min |
| Install AudioCraft | 10 min | 27 min |
| Start agent | 1 min | 28 min |
| Test & verify | 2 min | 30 min |
| First generation | 10 min | 40 min |
| Subsequent generations | 1 min | - |
Total to first music: ~40 minutes (including model download)
Resources
- Architecture: AGENT_ARCHITECTURE.md
- Quick Start: QUICK_START_AGENTS.md
- Solution Overview: SOLUTION_SUMMARY.md
- Test Results: TEST_RESULTS.md
Questions?
The agent architecture solves:
- ✅ Python version conflicts
- ✅ Dependency hell
- ✅ Scalability issues
- ✅ Deployment complexity
Running each model as its own service behind an HTTP API is a widely used pattern for serving ML models.
Ready? Let's forge some audio! 🎵
cd agents\music
py -3.11 -m venv venv
.\venv\Scripts\activate
pip install torch==2.1.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt
python main.py