Troubleshooting Guide
This guide helps diagnose and resolve common issues with the Enterprise AI Gateway.
Table of Contents
- Health Check Issues
- Authentication Problems
- API Request Errors
- LLM Provider Issues
- Performance Problems
- Deployment Issues
- AI Safety Issues
- Security Concerns
- Getting Additional Help
Health Check Issues
Service Unreachable
Symptoms:
- `/health` endpoint returns 502, 503, or connection timeout
- Application doesn't start
Possible Causes:
- Application not running
- Port binding issues
- Firewall/network restrictions
- Insufficient system resources
Solutions:
- Check if the application process is running: `ps aux | grep uvicorn`
- Verify port binding: `netstat -tlnp | grep :8000`
- Check application logs for startup errors
- Ensure firewall allows traffic on the application port
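The first two checks can also be scripted from a monitoring host; a minimal Python sketch (the host and port here are assumptions based on the examples above):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds (i.e. something is listening)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: is the gateway bound where we expect it?
# port_open("localhost", 8000)
```

A `False` here distinguishes "nothing is listening" (process down, wrong port) from application-level failures, which return an HTTP error instead.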
Health Status Unhealthy
Symptoms:
- `/health` returns status "unhealthy"
- Provider field is null or missing
Possible Causes:
- Missing or invalid LLM provider API keys
- Misconfigured environment variables
- Provider service unavailable
Solutions:
- Verify environment variables are set correctly: `cat .env`
- Check that at least one LLM provider API key is configured
- Test API keys with provider's API directly
- Check provider status pages for service outages
Authentication Problems
401 Unauthorized Errors
Symptoms:
- All API requests except `/health` return 401
- Error message: "Invalid or missing API key"
Possible Causes:
- Missing `X-API-Key` header
- Invalid API key value
- `SERVICE_API_KEY` environment variable not set
- API key mismatch between client and server
Solutions:
- Verify the `X-API-Key` header is included in requests: `curl -H "X-API-Key: your_api_key" http://localhost:8000/query`
- Check that `SERVICE_API_KEY` is set in the environment: `echo $SERVICE_API_KEY`
- Ensure API key values match between client and server
- Regenerate API key if it may have been compromised
API Key Rejected Despite Being Correct
Symptoms:
- Valid API key is rejected
- Works intermittently
Possible Causes:
- Timing-attack prevention introducing delays
- Character encoding issues
- Whitespace in API key
Solutions:
- Strip any leading/trailing whitespace from the API key: `SERVICE_API_KEY=$(echo "$SERVICE_API_KEY" | tr -d ' \t\n\r')`
- Ensure consistent character encoding (UTF-8)
- Regenerate API key with alphanumeric characters only
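The same sanitization can be done once at client startup so a bad key fails loudly; a small Python sketch (the alphanumeric-only character set follows the advice above and is an assumption about your key format):

```python
import re

def sanitize_api_key(raw):
    """Strip surrounding whitespace and reject keys containing non-alphanumeric characters."""
    key = raw.strip()
    if not re.fullmatch(r"[A-Za-z0-9]+", key):
        raise ValueError("API key contains unexpected characters")
    return key
```

Failing at startup is much easier to debug than intermittent 401s caused by a stray newline copied from a secrets file.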
API Request Errors
422 Validation Errors
Symptoms:
- Requests return 422 with validation error messages
- Specific field errors in response
Possible Causes:
- Prompt too short or too long
- Invalid `max_tokens` value
- Invalid `temperature` value
- Prompt injection detected
Solutions:
- Check prompt length (1-4000 characters)
- Verify `max_tokens` is between 1 and 2048
- Verify `temperature` is between 0.0 and 2.0
- Review the prompt for injection patterns like "ignore previous instructions"
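Validating these limits client-side avoids round-trips that are guaranteed to fail with 422; a sketch mirroring the documented ranges:

```python
def validate_query(prompt, max_tokens, temperature):
    """Check request fields against the gateway's documented limits; return a list of errors."""
    errors = []
    if not 1 <= len(prompt) <= 4000:
        errors.append("prompt must be 1-4000 characters")
    if not 1 <= max_tokens <= 2048:
        errors.append("max_tokens must be between 1 and 2048")
    if not 0.0 <= temperature <= 2.0:
        errors.append("temperature must be between 0.0 and 2.0")
    return errors
```

An empty list means the request satisfies the documented constraints; a 422 after that points at prompt-injection detection rather than field validation.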
429 Rate Limit Exceeded
Symptoms:
- Requests return 429 status code
- Error message: "Rate limit exceeded"
Possible Causes:
- Too many requests from the same IP within the time window
- Misconfigured rate limit settings
- Shared proxy/IP affecting multiple users
Solutions:
- Reduce request frequency to stay within limits
- Increase the rate limit in configuration: `RATE_LIMIT=20/minute`
- Implement exponential backoff in client code
- Use different IP addresses or API keys for different clients
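Exponential backoff with jitter can be sketched as follows (the `RateLimited` exception is a hypothetical wrapper your client would raise on an HTTP 429; the retry count and delays are illustrative):

```python
import random
import time

class RateLimited(Exception):
    """Raised by the caller's send() when the gateway returns HTTP 429."""

def backoff_delays(retries, base=1.0, cap=30.0):
    """Yield exponentially growing delays (base * 2**attempt, capped) with full jitter."""
    for attempt in range(retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

def call_with_backoff(send, retries=5, base=1.0):
    """Call send(); on RateLimited, sleep and retry with increasing delays."""
    for delay in backoff_delays(retries, base=base):
        try:
            return send()
        except RateLimited:
            time.sleep(delay)
    return send()  # final attempt; a 429 here propagates to the caller
```

The jitter matters: without it, many clients that were rate-limited together retry together and hit the limit again in lockstep.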
500 Internal Server Errors
Symptoms:
- Requests return 500 with generic error messages
- "All LLM providers failed" error
Possible Causes:
- All configured LLM providers are unavailable
- Network connectivity issues
- Provider API key issues
- Application bugs
Solutions:
- Check LLM provider status pages
- Verify all API keys are valid and have sufficient quotas
- Test network connectivity to provider endpoints
- Check application logs for specific error details
- Try configuring additional LLM providers
LLM Provider Issues
Provider Timeout
Symptoms:
- Slow responses or timeouts
- Fallback to secondary providers
Possible Causes:
- Provider API latency
- Network connectivity issues
- Provider rate limits exceeded
- Geographic distance from provider
Solutions:
- Check provider status dashboards
- Verify network connectivity: `ping generativelanguage.googleapis.com`
- Review provider rate limits and quotas
- Consider using providers geographically closer to your deployment
Provider Returns Empty Response
Symptoms:
- Valid responses with empty content
- Provider used but no text returned
Possible Causes:
- Provider API response format changed
- Content filtering blocking response
- Invalid request parameters
Solutions:
- Check provider documentation for response format changes
- Review content moderation settings
- Verify request parameters are within acceptable ranges
- Test with provider's API directly using same parameters
Provider Quota Exhausted
Symptoms:
- Sudden increase in errors from specific provider
- Provider-specific error messages about quotas
Possible Causes:
- Exceeded free tier limits
- Reached paid quota limits
- Billing issues with provider
Solutions:
- Check provider dashboard for quota usage
- Upgrade to paid tier if using free tier
- Verify billing information with provider
- Distribute load across multiple providers
Performance Problems
Slow Response Times
Symptoms:
- High latency in API responses
- User experience degradation
Possible Causes:
- Slow LLM provider responses
- Network latency
- Insufficient server resources
- Concurrent request overload
Solutions:
- Monitor provider response times individually
- Optimize network routing
- Scale server resources (CPU, memory)
- Implement caching for common requests
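Caching common requests can start as a simple in-process TTL cache; a sketch (the key shape and TTL are illustrative, and entries are not shared across instances or processes):

```python
import time

class ResponseCache:
    """Tiny in-memory TTL cache for LLM responses, keyed on (prompt, parameters)."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # evict stale entry
            return None
        return value

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```

Only cache deterministic requests (e.g. `temperature=0.0`) if repeatable answers matter; for multi-instance deployments a shared store such as Redis is the usual next step.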
- Use faster LLM providers when possible
High Memory Usage
Symptoms:
- Application crashes with out-of-memory errors
- System slowdown
Possible Causes:
- Memory leaks in application
- Large response payloads
- Too many concurrent requests
Solutions:
- Monitor memory usage over time
- Implement response size limits
- Add memory limits to container configuration
- Scale horizontally with multiple instances
Deployment Issues
Docker Container Won't Start
Symptoms:
- Container exits immediately
- Error messages in docker logs
Possible Causes:
- Missing environment variables
- Port conflicts
- Incorrect image tag
- Insufficient permissions
Solutions:
- Check container logs: `docker logs container_name`
- Verify all required environment variables are set
- Check for port conflicts; remap the host port if needed: `docker run -p 8001:8000 ...`
- Ensure proper permissions for mounted volumes
Environment Variables Not Loaded
Symptoms:
- Configuration values not applied
- Default values used instead
Possible Causes:
- Incorrect .env file format
- Environment file not mounted properly
- Variable names don't match expected names
Solutions:
- Verify the .env file format (no spaces around `=`): `SERVICE_API_KEY=your_key_here`
- Check that the environment file is properly mounted in Docker: `docker run --env-file .env ...`
- Confirm variable names match the documentation
AI Safety Issues
Content Blocked Unexpectedly
Symptoms:
- Safe prompts are being blocked
- False positives from AI safety check
Possible Causes:
- Toxicity threshold too low
- Edge case in Gemini classification
- Prompt contains keywords triggering false positives
Solutions:
- Increase the toxicity threshold: `TOXICITY_THRESHOLD=0.8` (default is 0.7)
- Check which category triggered the block in the response
- Review the prompt for unintended keywords
- Test the prompt directly with the `/check-toxicity` endpoint
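Threshold tuning is easier to reason about with the comparison spelled out; a sketch (that `/check-toxicity` returns per-category scores is an assumption about the response shape):

```python
def is_blocked(category_scores, threshold=0.7):
    """Return (blocked, category) given per-category toxicity scores and TOXICITY_THRESHOLD."""
    for category, score in category_scores.items():
        if score >= threshold:
            return True, category
    return False, None
```

A false positive scoring 0.72 is let through by raising the threshold to 0.8, while clearly harmful content scoring above 0.9 is still blocked.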
Gemini Safety API Errors
Symptoms:
- Errors mentioning Gemini API
- Safety check falling back to Lakera
Possible Causes:
- Invalid or expired GEMINI_API_KEY
- Gemini API quota exhausted
- Network connectivity issues
Solutions:
- Verify GEMINI_API_KEY is valid:

```
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"parts":[{"text":"Hello"}]}]}'
```

- Check Gemini API quota in the Google Cloud Console
- Add LAKERA_API_KEY for a reliable fallback
Lakera Guard Fallback Issues
Symptoms:
- Both Gemini and Lakera failing
- No safety check available
Possible Causes:
- LAKERA_API_KEY not configured
- Lakera API key invalid
- Both services experiencing outages
Solutions:
- Add LAKERA_API_KEY for fallback: `LAKERA_API_KEY=your_lakera_key`
- Test the Lakera key directly:

```
curl -X POST "https://api.lakera.ai/v2/guard" \
  -H "Authorization: Bearer $LAKERA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"content":"test","role":"user"}]}'
```

- Check the Lakera status page for outages
Harmful Content Not Being Blocked
Symptoms:
- Harmful prompts passing safety checks
- AI generating inappropriate content
Possible Causes:
- Toxicity threshold too high
- Gemini API not properly configured
- Prompt using evasion techniques
Solutions:
- Lower the toxicity threshold: `TOXICITY_THRESHOLD=0.5`
- Verify GEMINI_API_KEY is set correctly
- Add LAKERA_API_KEY for additional detection
- Review and update prompt injection patterns
Security Concerns
Suspicious Activity Detected
Symptoms:
- Unexpected traffic patterns
- High rate of blocked requests
- Unusual API usage
Possible Causes:
- Automated scanning/bot activity
- Compromised API keys
- Misconfigured rate limiting
Solutions:
- Review access logs for suspicious patterns
- Rotate potentially compromised API keys
- Implement IP whitelisting if appropriate
- Add more restrictive rate limiting
Prompt Injection Attempts
Symptoms:
- High number of requests with injection patterns
- Blocked requests with injection warnings
Possible Causes:
- Malicious users attempting to bypass security
- Legitimate users inadvertently triggering filters
- Overly aggressive injection detection
Solutions:
- Review blocked prompts to identify false positives
- Fine-tune injection detection patterns if needed
- Implement additional security layers
- Monitor for patterns in attack attempts
Getting Additional Help
If you're unable to resolve an issue:
- Check the GitHub Issues for similar problems
- Review application logs for detailed error messages
- Ensure you're using the latest version of the application
- Contact the development team with:
- Detailed description of the problem
- Steps to reproduce
- Relevant log excerpts
- Environment information