Space: specimba/nexus-os-space

specimba committed 5 days ago

Commit 3f17e4c (verified) · 1 parent: f73414b

v4.0 README: 5 real providers, self-contained, zero heavy deps

Files changed (1)

README.md +40 -42

@@ -1,5 +1,5 @@
 ---
-title: NEXUS OS v2.1
 emoji: 🔥
 colorFrom: red
 colorTo: purple
@@ -7,55 +7,53 @@ sdk: gradio
 sdk_version: 6.14.0
 app_file: app.py
 pinned: false
-tags:
-- ml-intern
 ---
-# NEXUS OS v2.1 — Real LLM Inference via HF API
-**Primary backend: HF Inference API** (free tier, works immediately)
-This Space provides GENUINE model inference without GPU access, ngrok tunnels, or paid cloud APIs.
 ## How It Works
-### 1. HF Inference API (Primary — No Setup Needed)
-- Uses your HF token (already active in Spaces)
-- Free tier: $0.10/month credits (~100-500 requests)
-- Models: SmolLM2-1.7B, Llama-3.2-1B, Qwen2.5-0.5B, Gemma-2-2B, Phi-4-mini
-- Just enter a prompt and click Generate — real inference immediately
-### 2. Ollama Relay (Optional — Your Local Models)
-- Expose your local Ollama: `ngrok http 11434`
-- Set `OLLAMA_RELAY_URL` in Space secrets
-- Access your 37+ local models through the Space
-### 3. Cloud API Fallback (Optional — Paid Providers)
-- DeepSeek, Claude, GPT-5, Qwen, Kimi, GLM
-- Add API keys to Space secrets
-- Used when HF Inference API and Ollama are unavailable
-### 4. Mock Mode (Last Resort)
-- Simulated responses with full telemetry
-- Useful for testing the UI without any backends
 ## Features
-- **37+ real models** in registry including Nemotron-3 Nano-Omni 30B and OpenSonnet-Lite-MAX
-- **4 hallucination detectors** (EPR, Spilled Energy, CK-PLUG, TWAVE)
-- **Novel composite signals**: EEP, PTI, NEWI
-- **Per-token thermodynamic telemetry** with risk scoring
-- **VRAM-aware model filtering** — only shows models that fit your budget
-## Quick Start
-1. Open the Space
-2. Enter a prompt
-3. Click "Generate with NEXUS OS"
-4. Get real inference + thermodynamic risk analysis
 ## Repository
 [specimba/nexus-os-v2](https://huggingface.co/datasets/specimba/nexus-os-v2)
-## Troubleshooting
-- **"HF Inference API unavailable"**: Your HF token may have exhausted free credits. The Space will fall back to mock mode.
-- **"Ollama relay unreachable"**: Check your ngrok tunnel is active and the URL is correct in Space secrets.
-- **"Cloud API failed"**: Ensure API keys are added as Space secrets (not hardcoded).

 ---
+title: NEXUS OS v4.0
 emoji: 🔥
 colorFrom: red
 colorTo: purple
 sdk_version: 6.14.0
 app_file: app.py
 pinned: false
 ---
+# NEXUS OS v4.0 — Intelligent Multi-Provider Router
+**COMPLETELY self-contained** — zero external dependencies except gradio + stdlib.
+No torch, no pinecone, no package imports that crash on startup.
 ## How It Works
+### Intelligent Routing (Auto-Detected)
+The app queries ALL configured providers in parallel, measures health + latency,
+and picks the best one automatically. Falls back through the chain if any fail.
+| Priority | Provider | Free Tier | Strength |
+|----------|----------|-----------|----------|
+| **1** | **HF Inference Providers** | $0.10/mo credits | Auto-routing, single HF token |
+| **2** | **Groq** | Generous | Fastest inference (LPU chips) |
+| **3** | **DeepSeek** | 5M tokens | Best reasoning models |
+| **4** | **OpenRouter** | 25+ free models | Most model variety |
+| **5** | **Together AI** | Rate-limited 70B | Large models, slow |
+| **6** | **Ollama Relay** | Your local models | Via ngrok tunnel |
+| **7** | **Mock** | Always works | Simulated for testing |
+### Setup
+**No setup needed for mock mode.** To get real inference, add API keys as Space secrets:
+| Secret | Provider | Get Key At |
+|--------|----------|------------|
+| `HF_TOKEN` | HF Inference Providers | Already active in Spaces |
+| `GROQ_API_KEY` | Groq | https://console.groq.com |
+| `DEEPSEEK_API_KEY` | DeepSeek | https://platform.deepseek.com |
+| `OPENROUTER_API_KEY` | OpenRouter | https://openrouter.ai |
+| `TOGETHER_API_KEY` | Together AI | https://api.together.xyz |
+| `OLLAMA_RELAY_URL` | Your local Ollama | `ngrok http 11434` |
 ## Features
+- **37+ real models** in registry
+- **Thermodynamic telemetry**: EEP, PTI, NEWI hallucination signals
+- **VRAM-aware filtering**: only shows models that fit your budget
+- **Per-token risk scoring**: hallucination detection simulation
+## What's New in v4.0
+- **Self-contained**: no `nexus_os_v2/` imports, no torch/pinecone dependencies
+- **5 real providers**: HF Router, Groq, DeepSeek, OpenRouter, Together AI
+- **Removed**: Kilocode (IDE plugin), OpenCode (IDE plugin), NVIDIA NIM (trial only), Fireworks ($1 credit)
+- **Intelligent routing**: parallel health checks, capability-based model selection
 ## Repository
 [specimba/nexus-os-v2](https://huggingface.co/datasets/specimba/nexus-os-v2)
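
The v4.0 README describes routing as: detect which providers have secrets configured, health-check them in parallel, pick the best, and fall back through the chain to mock mode. The commit does not show `app.py`, so the following is only a minimal stdlib sketch of that idea, not the Space's actual code. The provider names and secret names come from the README tables. The function names (`configured_providers`, `rank_providers`, `generate`), the `probe` and `call` callables, and the choice to keep healthy providers in README priority order rather than sorting by latency are all assumptions made for illustration.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

# Priority order and secret names follow the README tables.
# Everything else in this sketch is hypothetical.
PROVIDERS = [
    ("HF Inference Providers", "HF_TOKEN"),
    ("Groq", "GROQ_API_KEY"),
    ("DeepSeek", "DEEPSEEK_API_KEY"),
    ("OpenRouter", "OPENROUTER_API_KEY"),
    ("Together AI", "TOGETHER_API_KEY"),
    ("Ollama Relay", "OLLAMA_RELAY_URL"),
]


def configured_providers(env=os.environ):
    """Return provider names whose secret is set to a non-empty value."""
    return [name for name, secret in PROVIDERS if env.get(secret)]


def timed_probe(name, probe):
    """Run one health check; report (name, healthy, latency in seconds)."""
    start = time.perf_counter()
    try:
        healthy = bool(probe(name))
    except Exception:
        healthy = False  # a crashing probe counts as an unhealthy provider
    return name, healthy, time.perf_counter() - start


def rank_providers(names, probe):
    """Probe all providers in parallel; return the fallback chain.

    Assumption: healthy providers stay in README priority order and
    "Mock" is always appended as the last resort.
    """
    if not names:
        return ["Mock"]
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        results = list(pool.map(lambda n: timed_probe(n, probe), names))
    return [name for name, ok, _latency in results if ok] + ["Mock"]


def generate(prompt, chain, call):
    """Try each provider in the chain; fall back to a simulated reply."""
    for name in chain:
        if name == "Mock":
            break
        try:
            return name, call(name, prompt)
        except Exception:
            continue  # move on to the next provider in the chain
    return "Mock", f"[mock] {prompt}"
```

In a real app, `probe` would issue a cheap request to each provider and `call` would send the prompt; both are passed in here so the routing logic can be exercised without network access. With no secrets set, `rank_providers([], probe)` returns `["Mock"]`, which matches the README's statement that mock mode needs no setup.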