metadata
title: NEXUS OS v4.0
emoji: 🔥
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
app_file: app.py
pinned: false
tags:
- ml-intern
NEXUS OS v4.0 — Intelligent Multi-Provider Router
Completely self-contained: the only dependency outside the Python standard library is gradio. No torch, no pinecone, and no package imports that can crash on startup.
How It Works
Intelligent Routing (Auto-Detected)
The app queries all configured providers in parallel, measures the health and latency of each, and automatically picks the best one. If the selected provider fails, the app falls back to the next one in the chain.
| Priority | Provider | Free Tier | Strength |
|---|---|---|---|
| 1 | HF Inference Providers | $0.10/mo credits | Auto-routing, single HF token |
| 2 | Groq | Generous | Fastest inference (LPU chips) |
| 3 | DeepSeek | 5M tokens | Best reasoning models |
| 4 | OpenRouter | 25+ free models | Most model variety |
| 5 | Together AI | Rate-limited 70B | Large models, slow |
| 6 | Ollama Relay | Your local models | Via ngrok tunnel |
| 7 | Mock | Always works | Simulated for testing |
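The routing described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the names `PRIORITY`, `probe`, `probe_all`, `rank`, and `route` are hypothetical, and the ranking rule (healthy providers sorted by measured latency, with the table priority used only to break ties) is an assumption about what "best" means.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical short names, in the priority order of the table above.
PRIORITY = ["hf", "groq", "deepseek", "openrouter", "together", "ollama", "mock"]


def probe(name, check):
    """Run one health check; return (name, healthy, latency_seconds)."""
    start = time.perf_counter()
    try:
        healthy = bool(check())
    except Exception:
        healthy = False  # any error counts as unhealthy
    return name, healthy, time.perf_counter() - start


def probe_all(checks):
    """Probe every provider in parallel. `checks` maps name -> zero-arg callable."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        return list(pool.map(lambda item: probe(*item), checks.items()))


def rank(results):
    """Keep healthy providers, fastest first; table priority breaks ties (assumed rule)."""
    def priority(name):
        return PRIORITY.index(name) if name in PRIORITY else len(PRIORITY)

    healthy = [(latency, priority(name), name) for name, ok, latency in results if ok]
    return [name for _, _, name in sorted(healthy)]


def route(prompt, ranked, callers):
    """Try providers in ranked order; return (name, reply) from the first success."""
    for name in ranked:
        try:
            return name, callers[name](prompt)
        except Exception:
            continue  # fall back to the next provider in the chain
    raise RuntimeError("all providers failed")
```

In this sketch a provider is any callable that takes a prompt and returns text, so the mock provider can be a plain function that always succeeds and sits at the end of the chain.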
Setup
No setup needed for mock mode. To get real inference, add API keys as Space secrets:
| Secret | Provider | Get Key At |
|---|---|---|
| `HF_TOKEN` | HF Inference Providers | Already active in Spaces |
| `GROQ_API_KEY` | Groq | https://console.groq.com |
| `DEEPSEEK_API_KEY` | DeepSeek | https://platform.deepseek.com |
| `OPENROUTER_API_KEY` | OpenRouter | https://openrouter.ai |
| `TOGETHER_API_KEY` | Together AI | https://api.together.xyz |
| `OLLAMA_RELAY_URL` | Your local Ollama | `ngrok http 11434` |
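Space secrets are exposed to the app as environment variables, so provider detection can be as simple as checking which ones are set. The sketch below assumes that approach. The secret names come from the table above, while `SECRETS` and `configured_providers` are hypothetical names used only for illustration.

```python
import os

# Secret names from the table above, mapped to provider labels.
SECRETS = {
    "HF_TOKEN": "HF Inference Providers",
    "GROQ_API_KEY": "Groq",
    "DEEPSEEK_API_KEY": "DeepSeek",
    "OPENROUTER_API_KEY": "OpenRouter",
    "TOGETHER_API_KEY": "Together AI",
    "OLLAMA_RELAY_URL": "Ollama Relay",
}


def configured_providers(env=None):
    """Return labels of providers whose secret is set and non-empty, plus Mock."""
    env = os.environ if env is None else env
    found = [label for secret, label in SECRETS.items() if env.get(secret, "").strip()]
    # Mock needs no secret, so it is always available as the last resort.
    return found + ["Mock"]
```

Passing a plain dict instead of `os.environ` makes the function easy to try locally without touching real secrets.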
Features
- 37+ real models in registry
- Thermodynamic telemetry: EEP, PTI, NEWI hallucination signals
- VRAM-aware filtering: only shows models that fit your budget
- Per-token risk scoring: hallucination detection simulation
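As an illustration of VRAM-aware filtering, the sketch below keeps only the models whose estimated memory requirement fits a given budget. The `ModelEntry` type, the `vram_gb` field, and the registry entries are placeholders invented for this example. They are not the app's actual registry or its real model sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    vram_gb: float  # assumed: estimated VRAM needed to run the model


# Placeholder entries, not real models from the registry.
REGISTRY = [
    ModelEntry("small-model", 6.0),
    ModelEntry("medium-model", 16.0),
    ModelEntry("large-model", 48.0),
]


def models_within_budget(registry, budget_gb):
    """Return names of models whose VRAM estimate fits within the budget."""
    return [m.name for m in registry if m.vram_gb <= budget_gb]
```

With these placeholder values, a 16 GB budget would list the small and medium entries and hide the large one.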
What's New in v4.0
- Self-contained: no `nexus_os_v2/` imports, no torch/pinecone dependencies
- 5 real providers: HF Router, Groq, DeepSeek, OpenRouter, Together AI
- Removed: Kilocode (IDE plugin), OpenCode (IDE plugin), NVIDIA NIM (trial only), Fireworks ($1 credit)
- Intelligent routing: parallel health checks, capability-based model selection