AudioForge

Project Overview

AudioForge is a production-ready, fully open-source text-to-music generation platform inspired by Suno. It leverages a multi-stage pipeline to generate high-quality audio from text descriptions, including instrumental tracks and vocals.

Key Features

  • Multi-stage Pipeline: Prompt understanding, music generation (MusicGen), vocal generation (Bark/XTTS), and post-processing.
  • Modern Stack: FastAPI backend, Next.js 14 frontend, PostgreSQL, Redis.
  • Observability: Structured logging, Prometheus metrics, OpenTelemetry tracing.
  • UI/UX: Beautiful, responsive UI built with Tailwind CSS and Radix UI.

Tech Stack

  • Frontend: Next.js 14, TypeScript, Tailwind CSS, Radix UI, React Query.
  • Backend: FastAPI (Python 3.11+), Pydantic, SQLAlchemy, AsyncPG.
  • Database: PostgreSQL 16+.
  • Caching: Redis 7+.
  • ML/AI: PyTorch, AudioCraft (MusicGen), Bark, Transformers.
  • Infrastructure: Docker, Docker Compose.

Getting Started

Prerequisites

  • Docker & Docker Compose (Recommended)
  • Python 3.11+
  • Node.js 20+

Quick Start (Docker)

The easiest way to run the entire stack is via Docker Compose:

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Check status
docker-compose ps

Once the containers are running, the services are available at:

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000
  • API Docs: http://localhost:8000/api/docs
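
To confirm that the stack came up, the three URLs above can be polled from a short script. The following is a minimal sketch, not part of the repository: the names `SERVICES`, `http_status`, and `check_services` are hypothetical, and it assumes each URL answers with a 2xx or 3xx status once its service is healthy.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# URLs copied from the list above. That each one answers with a success
# status when healthy is an assumption, not something the project documents.
SERVICES: Dict[str, str] = {
    "frontend": "http://localhost:3000",
    "backend": "http://localhost:8000",
    "api-docs": "http://localhost:8000/api/docs",
}


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, and similar


def check_services(
    services: Dict[str, str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each service name to True when its URL returns a 2xx or 3xx status."""
    return {name: 200 <= fetch(url) < 400 for name, url in services.items()}
```

Calling `check_services(SERVICES)` after `docker-compose up -d` returns one boolean per service. The `fetch` parameter exists only so the check can be exercised without a running stack.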

Manual Setup

Backend:

cd backend
python scripts/quick_setup.py
python scripts/init_db.py
uvicorn app.main:app --reload

Frontend:

cd frontend
pnpm install
# Create .env.local with NEXT_PUBLIC_API_URL=http://localhost:8000
pnpm dev

Key Commands

Backend (backend/)

| Action         | Command                         |
|----------------|---------------------------------|
| Setup          | python scripts/quick_setup.py   |
| Run Dev Server | uvicorn app.main:app --reload   |
| Run Tests      | pytest                          |
| Init Database  | python scripts/init_db.py       |
| Verify Setup   | python scripts/verify_setup.py  |

Frontend (frontend/)

| Action               | Command            |
|----------------------|--------------------|
| Install Dependencies | pnpm install       |
| Run Dev Server       | pnpm dev           |
| Build                | pnpm build         |
| Test                 | pnpm test (Vitest) |
| Lint                 | pnpm lint          |
| Type Check           | pnpm type-check    |

Global Scripts (scripts/)

  • launch.ps1 / launch.sh: Master launch script.
  • check_status.bat: Checks the health of services.
  • install_ml_dependencies.ps1: Installs heavy ML libraries.

Architecture

The system follows a microservices-inspired architecture:

  • backend/: The core API service handling request coordination, DB interactions, and pipeline orchestration.
  • frontend/: The user interface for prompting and playback.
  • agents/: Specialized Python agents for distinct generation tasks (Music, Vocals, Processing).
  • storage/: Local storage for generated audio files.

Generation Pipeline

  1. Prompt Understanding: Analyzes text for style, mood, tempo.
  2. Music Generation: Generates instrumental track using MusicGen.
  3. Vocal Generation (optional): Generates vocals from lyrics using Bark/XTTS.
  4. Mixing & Mastering: Combines tracks and applies audio effects.
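
The four stages above can be sketched as a chain of async functions. This is an illustrative outline under stated assumptions, not the project's orchestration code: every function and field name is hypothetical, and the stage bodies are stubs that return strings where a real implementation would call MusicGen, Bark/XTTS, and an audio-effects chain.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PromptAnalysis:
    """Result of stage 1; the fields mirror the style/mood/tempo listed above."""
    style: str
    mood: str
    tempo_bpm: int


# All stage functions below are hypothetical stubs with fixed return values.
async def understand_prompt(prompt: str) -> PromptAnalysis:
    return PromptAnalysis(style="lo-fi", mood="calm", tempo_bpm=80)


async def generate_music(analysis: PromptAnalysis) -> str:
    return f"instrumental({analysis.style}, {analysis.tempo_bpm}bpm)"


async def generate_vocals(lyrics: str) -> str:
    return f"vocals({len(lyrics.split())} words)"


async def mix_and_master(instrumental: str, vocals: Optional[str]) -> str:
    tracks: List[str] = [instrumental] if vocals is None else [instrumental, vocals]
    return "master[" + " + ".join(tracks) + "]"


async def run_pipeline(prompt: str, lyrics: Optional[str] = None) -> str:
    """Run the stages in order; the vocal stage is skipped when no lyrics are given."""
    analysis = await understand_prompt(prompt)
    instrumental = await generate_music(analysis)
    vocals = await generate_vocals(lyrics) if lyrics else None
    return await mix_and_master(instrumental, vocals)
```

A caller would run it with `asyncio.run(run_pipeline("calm lo-fi beat", lyrics="..."))`. The stages run sequentially here because that is the order the list gives; whether the real pipeline overlaps music and vocal generation is not stated.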

Development Conventions

  • Python:
    • Use type hints throughout.
    • Use async/await for I/O-bound operations.
    • Use Pydantic models for data validation and schemas.
    • Follow PEP 8 style (enforced by Black/Ruff).
  • Frontend:
    • Keep TypeScript strictly typed (avoid the any type).
    • React Query for server state management.
    • Tailwind CSS for styling.
    • Radix UI for accessible components.
  • Testing:
    • Write tests for new features (pytest for backend, vitest for frontend).
    • Ensure strict test coverage.
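
As a rough illustration of the Python and testing conventions above, the sketch below combines type hints, an async I/O-bound function, input validation, and pytest-style tests. It is an assumption-laden example rather than project code: `GenerationRequest`, `save_request`, and the 1 to 300 second limit are hypothetical, and a standard-library dataclass stands in for a Pydantic model so the example runs without extra dependencies.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class GenerationRequest:
    """Stand-in for a Pydantic model; field names and limits are hypothetical."""
    prompt: str
    duration_seconds: int = 30

    def __post_init__(self) -> None:
        # A Pydantic model would express these checks as field constraints.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 1 <= self.duration_seconds <= 300:
            raise ValueError("duration_seconds must be between 1 and 300")


async def save_request(
    request: GenerationRequest, store: Dict[int, GenerationRequest]
) -> int:
    """Store the request and return its id; the sleep stands in for a DB round trip."""
    await asyncio.sleep(0)
    request_id = len(store) + 1
    store[request_id] = request
    return request_id


# pytest-style tests: plain functions with bare asserts.
def test_save_request_assigns_sequential_ids() -> None:
    store: Dict[int, GenerationRequest] = {}
    first = asyncio.run(save_request(GenerationRequest("calm piano"), store))
    second = asyncio.run(save_request(GenerationRequest("fast drums", 10), store))
    assert (first, second) == (1, 2)


def test_empty_prompt_is_rejected() -> None:
    try:
        GenerationRequest("   ")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an empty prompt")
```

Saved as a file whose name starts with `test_`, the two test functions would be collected by `pytest` without further configuration.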