
Deployment Guide

Primary Responsibility: Deployment procedures for all environments (local, Docker, cloud)

This guide explains how to deploy the Enterprise AI Gateway in different environments.

Table of Contents

  1. Prerequisites
  2. Local Deployment
  3. Docker Deployment
  4. Cloud Deployment
  5. Production Considerations

Prerequisites

  • Docker (for Docker deployment)
  • Python 3.8+ (for local deployment)
  • Git
  • API keys for at least one LLM provider

Local Deployment

1. Clone the Repository

```shell
git clone https://github.com/vn6295337/Enterprise-AI-Gateway.git
cd Enterprise-AI-Gateway
```

2. Set Up Environment

```shell
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env with your API keys
```
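Before starting the server, it can help to confirm the required variables actually made it into `.env`. A minimal sketch, assuming the `SERVICE_API_KEY` name used elsewhere in this guide (`check_env` is a hypothetical helper, not part of the project):

```shell
# check_env FILE — warn if the gateway's own auth key is missing (hypothetical helper)
check_env() {
  if grep -q '^SERVICE_API_KEY=' "$1"; then
    echo "ok"
  else
    echo "SERVICE_API_KEY missing"
  fi
}
```

Run `check_env .env` before launching; extend the pattern for any provider keys you rely on.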

3. Run the Application

```shell
uvicorn src.main:app --host 0.0.0.0 --port 8000
```

The application will be available at http://localhost:8000.
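Once uvicorn reports startup complete, a quick probe confirms the port is actually serving. A sketch assuming `curl` is installed (which endpoint answers unauthenticated requests depends on the app's routes):

```shell
# wait_for URL TRIES — poll once per second until the URL answers
wait_for() {
  tries="$2"
  while [ "$tries" -gt 0 ]; do
    if curl -fsS -o /dev/null "$1"; then
      echo "up"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "not responding"
  return 1
}
```

For example, `wait_for http://localhost:8000/ 10` waits up to ten seconds for the gateway to come up.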

Docker Deployment

1. Build the Docker Image

```shell
docker build -t llm-secure-gateway .
```

2. Run with Environment Variables

```shell
docker run -d \
  -e SERVICE_API_KEY=your_service_api_key \
  -e GEMINI_API_KEY=your_gemini_api_key \
  -e GROQ_API_KEY=your_groq_api_key \
  -e OPENROUTER_API_KEY=your_openrouter_api_key \
  -p 8000:8000 \
  --name llm-gateway \
  llm-secure-gateway
```

3. Run with Environment File

Create a .env file with your configuration, then:

```shell
docker run -d \
  --env-file .env \
  -p 8000:8000 \
  --name llm-gateway \
  llm-secure-gateway
```
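For reference, a minimal `.env` matching the variables used in this guide might look like this (all values are placeholders):

```shell
SERVICE_API_KEY=your_service_api_key
GEMINI_API_KEY=your_gemini_api_key
GROQ_API_KEY=your_groq_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
```

Note that `docker run --env-file` expects plain `KEY=value` lines without quoting or `export`.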

Cloud Deployment

Hugging Face Spaces

  1. Create a new Space at https://huggingface.co/new-space
  2. Choose "Docker" as the SDK
  3. Select a Docker image (e.g., python:3.11-slim)
  4. Add your repository URL
  5. In Space settings, add the following secrets:
    • SERVICE_API_KEY
    • GEMINI_API_KEY (optional)
    • GROQ_API_KEY (optional)
    • OPENROUTER_API_KEY (optional)

AWS Deployment

Using EC2

  1. Launch an EC2 instance with Ubuntu
  2. SSH into the instance
  3. Install Docker:
```shell
sudo apt update
sudo apt install docker.io -y
sudo systemctl start docker
sudo systemctl enable docker
```
  4. Deploy the container:
```shell
sudo docker run -d \
  -e SERVICE_API_KEY=your_service_api_key \
  -e GEMINI_API_KEY=your_gemini_api_key \
  -e GROQ_API_KEY=your_groq_api_key \
  -e OPENROUTER_API_KEY=your_openrouter_api_key \
  -p 80:8000 \
  --name llm-gateway \
  llm-secure-gateway
```

Using ECS

  1. Create an ECS cluster
  2. Create a task definition with the container image
  3. Configure environment variables in the task definition
  4. Create a service to run the task
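The container portion of such a task definition might look roughly like this (the registry path and values are placeholders; Fargate tasks additionally require `cpu` and `memory` settings, and secrets are better supplied via the `secrets` field backed by Secrets Manager):

```json
{
  "family": "llm-gateway",
  "containerDefinitions": [
    {
      "name": "llm-gateway",
      "image": "<your-registry>/llm-secure-gateway:latest",
      "portMappings": [{ "containerPort": 8000 }],
      "environment": [
        { "name": "SERVICE_API_KEY", "value": "your_service_api_key" }
      ]
    }
  ]
}
```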

Google Cloud Platform

Using Compute Engine

  1. Create a Compute Engine instance
  2. SSH into the instance
  3. Install Docker and deploy as above

Using Cloud Run

  1. Build and push the Docker image to Container Registry
  2. Deploy to Cloud Run with environment variables
  3. Configure authentication and networking
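With the gcloud CLI, steps 1–2 can be sketched as follows (the project ID, region, and `--allow-unauthenticated` flag are assumptions to adapt; running this requires authenticated cloud credentials):

```shell
PROJECT_ID=my-project
IMAGE="gcr.io/${PROJECT_ID}/llm-secure-gateway"

# Build and push the image, then deploy it to Cloud Run
gcloud builds submit --tag "$IMAGE"
gcloud run deploy llm-gateway \
  --image "$IMAGE" \
  --region us-central1 \
  --port 8000 \
  --set-env-vars "SERVICE_API_KEY=your_service_api_key" \
  --allow-unauthenticated
```

Dropping `--allow-unauthenticated` lets Cloud Run's own IAM layer gate requests in addition to the gateway's `SERVICE_API_KEY`.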

Azure

Using Virtual Machines

  1. Create a VM
  2. SSH into the instance
  3. Install Docker and deploy as above

Using Azure Container Instances

  1. Create a container group
  2. Specify the image and environment variables
  3. Configure networking and authentication
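Steps 1–3 with the Azure CLI might look like this (the resource group, registry path, and DNS label are placeholders; ACI pulls the image from a registry, so push it first — running this requires an authenticated Azure session):

```shell
az group create --name llm-gateway-rg --location eastus

az container create \
  --resource-group llm-gateway-rg \
  --name llm-gateway \
  --image <your-registry>/llm-secure-gateway:latest \
  --ports 8000 \
  --dns-name-label llm-gateway-demo \
  --environment-variables SERVICE_API_KEY=your_service_api_key
```

For real keys, prefer `--secure-environment-variables` so values are not displayed in the portal or CLI output.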

Production Considerations

Security

  1. Use HTTPS: Always deploy with SSL/TLS encryption
  2. Restrict CORS: Set specific allowed origins instead of *
  3. Rotate API Keys: Regularly rotate service and provider API keys
  4. Monitor Logs: Set up logging and monitoring
  5. Rate Limiting: Adjust rate limits based on expected usage
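For point 3, a strong random key can be generated with openssl. A rotation sketch (the container commands mirror the Docker examples above and are commented out so the snippet is safe to run standalone):

```shell
# Generate a 64-character hex key to use as the new SERVICE_API_KEY
NEW_KEY=$(openssl rand -hex 32)
echo "generated key of length ${#NEW_KEY}"

# Then restart the gateway with the new key, e.g.:
# docker rm -f llm-gateway
# docker run -d -e SERVICE_API_KEY="$NEW_KEY" -p 8000:8000 \
#   --name llm-gateway llm-secure-gateway
```

Remember to distribute the new key to clients before retiring the old one.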

Performance

  1. Load Balancing: Use a load balancer for high availability
  2. Auto-scaling: Configure auto-scaling based on demand
  3. Caching: Implement caching for frequently requested responses
  4. Database: Use a production database for storing logs/metrics

Monitoring

  1. Health Checks: Implement health checks for load balancers
  2. Metrics: Collect and monitor performance metrics
  3. Alerts: Set up alerts for errors and performance issues
  4. Logging: Centralize logs for debugging and auditing

Backup and Recovery

  1. Configuration Backup: Backup environment configurations
  2. Disaster Recovery: Plan for disaster recovery scenarios
  3. Rollback Strategy: Have a rollback strategy for deployments

Environment Configuration

See the Configuration Guide for the complete environment variable reference.

Troubleshooting

See Troubleshooting Guide for detailed help.

Quick debugging:

```shell
docker logs llm-gateway     # View logs
docker ps                   # Check running containers
docker exec -it llm-gateway /bin/bash  # Access shell
```

Maintenance

Updates

To update the application:

  1. Pull the latest code or Docker image
  2. Update environment variables if needed
  3. Restart the service

Monitoring

Regular monitoring tasks:

  1. Check application logs
  2. Monitor API usage and costs
  3. Verify LLM provider availability
  4. Review security logs

Scaling

Vertical Scaling

Increase resources allocated to the container/host:

  • More CPU
  • More memory
  • Better network bandwidth

Horizontal Scaling

Deploy multiple instances behind a load balancer:

  • Use sticky sessions if needed
  • Share configuration across instances
  • Monitor individual instance health
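A minimal reverse-proxy sketch of this setup in nginx (the instance addresses are placeholders; TLS termination and active health checks are omitted for brevity):

```nginx
upstream llm_gateway {
    server 10.0.0.11:8000;
    server 10.0.0.12:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://llm_gateway;
        proxy_set_header Host $host;
    }
}
```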