Multi-region deployment
#1
by Cagnicolas - opened
The 8B instruct MLA variant is a heavy hitter; deployment needs careful autoscaling and request tracing. One option is to expose this as a hosted endpoint so users don't have to run it locally (AlphaNeural does this). Are you planning multi-region deployment?