---
title: NEXUS OS v4.0
emoji: 🔥
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
app_file: app.py
pinned: false
tags:
- ml-intern
---

# NEXUS OS v4.0 — Intelligent Multi-Provider Router

**Completely self-contained**: the only dependencies are gradio and the Python standard library.
There is no torch, no pinecone, and no package import that can crash on startup.

## How It Works

### Intelligent Routing (Auto-Detected)
The app queries all configured providers in parallel, measures the health and latency of each,
and picks the best one automatically. If a provider fails, the app falls back to the next one in the chain.

| Priority | Provider | Free Tier | Strength |
|----------|----------|-----------|----------|
| **1** | **HF Inference Providers** | $0.10/mo credits | Auto-routing, single HF token |
| **2** | **Groq** | Generous | Fastest inference (LPU chips) |
| **3** | **DeepSeek** | 5M tokens | Best reasoning models |
| **4** | **OpenRouter** | 25+ free models | Most model variety |
| **5** | **Together AI** | Rate-limited 70B | Large models, slow |
| **6** | **Ollama Relay** | Your local models | Via ngrok tunnel |
| **7** | **Mock** | Always works | Simulated for testing |
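
The routing idea can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the short provider labels, the function names (`probe`, `rank_providers`, `call_with_fallback`), and the rule "drop anything unhealthy or slower than `max_latency`, then order by table priority" are all assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Priority order taken from the table above; the short labels are illustrative.
PRIORITY = ["hf", "groq", "deepseek", "openrouter", "together", "ollama", "mock"]
RANK = {name: i for i, name in enumerate(PRIORITY)}


def probe(name, check):
    """Run one health check and return (name, healthy, latency_seconds)."""
    start = time.perf_counter()
    try:
        healthy = bool(check())
    except Exception:
        healthy = False  # any error counts as an unhealthy provider
    return name, healthy, time.perf_counter() - start


def rank_providers(checks, max_latency=5.0):
    """Probe all providers in parallel and return usable names, best first.

    Assumption: a provider is usable if it is healthy and answered within
    max_latency seconds; usable providers are then ordered by table priority.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        results = list(pool.map(lambda item: probe(*item), checks.items()))
    usable = [name for name, ok, latency in results if ok and latency <= max_latency]
    return sorted(usable, key=lambda name: RANK.get(name, len(PRIORITY)))


def call_with_fallback(order, senders, prompt):
    """Try providers in order and return the first successful (name, reply)."""
    errors = {}
    for name in order:
        try:
            return name, senders[name](prompt)
        except Exception as exc:
            errors[name] = exc  # remember the failure and move down the chain
    raise RuntimeError(f"all providers failed: {errors}")
```

Here `checks` maps a provider label to a zero-argument health check and `senders` maps a label to a function that takes a prompt. Keeping the mock provider in both mappings means the chain always has a last resort.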

### Setup

**No setup needed for mock mode.** To get real inference, add API keys as Space secrets:

| Secret | Provider | Get Key At |
|--------|----------|------------|
| `HF_TOKEN` | HF Inference Providers | Already active in Spaces |
| `GROQ_API_KEY` | Groq | https://console.groq.com |
| `DEEPSEEK_API_KEY` | DeepSeek | https://platform.deepseek.com |
| `OPENROUTER_API_KEY` | OpenRouter | https://openrouter.ai |
| `TOGETHER_API_KEY` | Together AI | https://api.together.xyz |
| `OLLAMA_RELAY_URL` | Your local Ollama | Tunnel URL from `ngrok http 11434` |
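
Space secrets are exposed to the app as environment variables, so detecting which providers are configured could look something like the sketch below. The secret names come from the table above; the provider labels and the `configured_providers` helper are hypothetical and may not match what `app.py` does.

```python
import os

# Secret names are from the table above; the provider labels are illustrative.
SECRET_TO_PROVIDER = {
    "HF_TOKEN": "hf",
    "GROQ_API_KEY": "groq",
    "DEEPSEEK_API_KEY": "deepseek",
    "OPENROUTER_API_KEY": "openrouter",
    "TOGETHER_API_KEY": "together",
    "OLLAMA_RELAY_URL": "ollama",
}


def configured_providers(env=None):
    """Return providers whose secret is set and non-empty, plus the mock fallback."""
    env = os.environ if env is None else env
    found = [
        provider
        for secret, provider in SECRET_TO_PROVIDER.items()
        if env.get(secret, "").strip()
    ]
    return found + ["mock"]  # mock mode needs no secret
```

Passing a plain dict as `env` makes the helper easy to test without touching the real environment.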

## Features
- **37+ real models** in registry
- **Thermodynamic telemetry**: EEP, PTI, NEWI hallucination signals
- **VRAM-aware filtering**: only shows models that fit your budget
- **Per-token risk scoring**: hallucination detection simulation
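
As a rough illustration of VRAM-aware filtering, the sketch below keeps only the registry entries that fit a given budget. The `ModelEntry` type, its `vram_gb` field, and the placeholder model names are assumptions for the example and do not reflect the actual registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical registry entry: a model name and its estimated VRAM need."""
    name: str
    vram_gb: float


def fits_budget(models, budget_gb):
    """Return the models whose estimated VRAM need is within the budget."""
    return [m for m in models if m.vram_gb <= budget_gb]


# Placeholder entries, not real registry data.
registry = [
    ModelEntry("model-small", 4.0),
    ModelEntry("model-medium", 16.0),
    ModelEntry("model-large", 48.0),
]
visible = [m.name for m in fits_budget(registry, budget_gb=16.0)]
```

With a 16 GB budget, `visible` contains `model-small` and `model-medium`; the 48 GB entry is filtered out.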

## What's New in v4.0
- **Self-contained**: no `nexus_os_v2/` imports, no torch/pinecone dependencies
- **5 real providers**: HF Router, Groq, DeepSeek, OpenRouter, Together AI
- **Removed**: Kilocode (IDE plugin), OpenCode (IDE plugin), NVIDIA NIM (trial only), Fireworks ($1 credit)
- **Intelligent routing**: parallel health checks, capability-based model selection

## Repository
[specimba/nexus-os-v2](https://huggingface.co/datasets/specimba/nexus-os-v2)