Enterprise-AI-Gateway / docs /deployment.md
vn6295337's picture
Initial commit: Enterprise-AI-Gateway - Secure LLM gateway
bb0c63f
# Deployment Guide
> **Primary Responsibility:** Deployment procedures for all environments (local, Docker, cloud)
This guide explains how to deploy the Enterprise AI Gateway in different environments.
## Table of Contents
1. [Prerequisites](#prerequisites)
2. [Local Deployment](#local-deployment)
3. [Docker Deployment](#docker-deployment)
4. [Cloud Deployment](#cloud-deployment)
5. [Production Considerations](#production-considerations)
## Prerequisites
- Docker (for Docker deployment)
- Python 3.8+ (for local deployment)
- Git
- API keys for at least one LLM provider
## Local Deployment
### 1. Clone the Repository
```bash
git clone https://github.com/vn6295337/Enterprise-AI-Gateway.git
cd Enterprise-AI-Gateway
```
### 2. Set Up Environment
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure environment variables
cp .env.example .env
# Edit .env with your API keys
```
### 3. Run the Application
```bash
uvicorn src.main:app --host 0.0.0.0 --port 8000
```
The application will be available at `http://localhost:8000`.
## Docker Deployment
### 1. Build the Docker Image
```bash
docker build -t llm-secure-gateway .
```
### 2. Run with Environment Variables
```bash
docker run -d \
-e SERVICE_API_KEY=your_service_api_key \
-e GEMINI_API_KEY=your_gemini_api_key \
-e GROQ_API_KEY=your_groq_api_key \
-e OPENROUTER_API_KEY=your_openrouter_api_key \
-p 8000:8000 \
--name llm-gateway \
llm-secure-gateway
```
### 3. Run with Environment File
Create a `.env` file with your configuration, then:
```bash
docker run -d \
--env-file .env \
-p 8000:8000 \
--name llm-gateway \
llm-secure-gateway
```
## Cloud Deployment
### Hugging Face Spaces
1. Create a new Space at [https://huggingface.co/new-space](https://huggingface.co/new-space)
2. Choose "Docker" as the SDK
3. Select a Docker image (e.g., `python:3.11-slim`)
4. Add your repository URL
5. In Space settings, add the following secrets:
- `SERVICE_API_KEY`
- `GEMINI_API_KEY` (optional)
- `GROQ_API_KEY` (optional)
- `OPENROUTER_API_KEY` (optional)
### AWS Deployment
#### Using EC2
1. Launch an EC2 instance with Ubuntu
2. SSH into the instance
3. Install Docker:
```bash
sudo apt update
sudo apt install docker.io -y
sudo systemctl start docker
sudo systemctl enable docker
```
4. Deploy the container:
```bash
sudo docker run -d \
-e SERVICE_API_KEY=your_service_api_key \
-e GEMINI_API_KEY=your_gemini_api_key \
-e GROQ_API_KEY=your_groq_api_key \
-e OPENROUTER_API_KEY=your_openrouter_api_key \
-p 80:8000 \
--name llm-gateway \
llm-secure-gateway
```
#### Using ECS
1. Create an ECS cluster
2. Create a task definition with the container image
3. Configure environment variables in the task definition
4. Create a service to run the task
### Google Cloud Platform
#### Using Compute Engine
1. Create a Compute Engine instance
2. SSH into the instance
3. Install Docker and deploy as above
#### Using Cloud Run
1. Build and push the Docker image to Container Registry
2. Deploy to Cloud Run with environment variables
3. Configure authentication and networking
### Azure
#### Using Virtual Machines
1. Create a VM
2. SSH into the instance
3. Install Docker and deploy as above
#### Using Azure Container Instances
1. Create a container group
2. Specify the image and environment variables
3. Configure networking and authentication
## Production Considerations
### Security
1. **Use HTTPS**: Always deploy with SSL/TLS encryption
2. **Restrict CORS**: Set specific allowed origins instead of `*`
3. **Rotate API Keys**: Regularly rotate service and provider API keys
4. **Monitor Logs**: Set up logging and monitoring
5. **Rate Limiting**: Adjust rate limits based on expected usage
### Performance
1. **Load Balancing**: Use a load balancer for high availability
2. **Auto-scaling**: Configure auto-scaling based on demand
3. **Caching**: Implement caching for frequently requested responses
4. **Database**: Use a production database for storing logs/metrics
### Monitoring
1. **Health Checks**: Implement health checks for load balancers
2. **Metrics**: Collect and monitor performance metrics
3. **Alerts**: Set up alerts for errors and performance issues
4. **Logging**: Centralize logs for debugging and auditing
### Backup and Recovery
1. **Configuration Backup**: Backup environment configurations
2. **Disaster Recovery**: Plan for disaster recovery scenarios
3. **Rollback Strategy**: Have a rollback strategy for deployments
## Environment Configuration
See [Configuration Guide](configuration.md) for complete environment variable reference.
## Troubleshooting
See [Troubleshooting Guide](troubleshooting.md) for detailed help.
**Quick debugging:**
```bash
docker logs llm-gateway # View logs
docker ps # Check running containers
docker exec -it llm-gateway /bin/bash # Access shell
```
## Maintenance
### Updates
To update the application:
1. Pull the latest code or Docker image
2. Update environment variables if needed
3. Restart the service
### Monitoring
Regular monitoring tasks:
1. Check application logs
2. Monitor API usage and costs
3. Verify LLM provider availability
4. Review security logs
## Scaling
### Vertical Scaling
Increase resources allocated to the container/host:
- More CPU
- More memory
- Better network bandwidth
### Horizontal Scaling
Deploy multiple instances behind a load balancer:
- Use sticky sessions if needed
- Share configuration across instances
- Monitor individual instance health