Spaces: specimba/nexus-os-space

specimba committed 6 days ago

Commit 42bc228 (verified) · Parent: 8c867a4

Update README: HF Inference API is now primary backend

Files changed (1)

README.md +39 -9

@@ -11,21 +11,51 @@ tags:
 - ml-intern
 ---
-# NEXUS OS v2.1 — Thermodynamic LLM Control System
-Hybrid Cloud + Local Inference with BEC Thermodynamic Hallucination Control.
 ## Features
-- **37+ real models** including Nemotron-3 Nano-Omni 30B and OpenSonnet-Lite-MAX
-- **Ollama relay** — connect your local Ollama via ngrok/tunnel
-- **Cloud API fallback** — DeepSeek, Claude, GPT-5, Qwen, Kimi, GLM
 - **4 hallucination detectors** (EPR, Spilled Energy, CK-PLUG, TWAVE)
 - **Novel composite signals**: EEP, PTI, NEWI
-## Setup
-1. Expose your local Ollama: `ngrok http 11434`
-2. Set `OLLAMA_RELAY_URL` in Space secrets
-3. Add cloud API keys as needed
 ## Repository
 [specimba/nexus-os-v2](https://huggingface.co/datasets/specimba/nexus-os-v2)

 - ml-intern
 ---
+# NEXUS OS v2.1 — Real LLM Inference via HF API
+**Primary backend: HF Inference API** (free tier, works immediately)
+This Space provides genuine model inference without GPU access, ngrok tunnels, or paid cloud APIs.
+## How It Works
+### 1. HF Inference API (Primary — No Setup Needed)
+- Uses your HF token (already active in Spaces)
+- Free tier: $0.10/month credits (~100-500 requests)
+- Models: SmolLM2-1.7B, Llama-3.2-1B, Qwen2.5-0.5B, Gemma-2-2B, Phi-4-mini
+- Just enter a prompt and click Generate — real inference immediately
+### 2. Ollama Relay (Optional — Your Local Models)
+- Expose your local Ollama: `ngrok http 11434`
+- Set `OLLAMA_RELAY_URL` in Space secrets
+- Access your 37+ local models through the Space
+### 3. Cloud API Fallback (Optional — Paid Providers)
+- DeepSeek, Claude, GPT-5, Qwen, Kimi, GLM
+- Add API keys to Space secrets
+- Used when HF Inference API and Ollama are unavailable
+### 4. Mock Mode (Last Resort)
+- Simulated responses with full telemetry
+- Useful for testing the UI without any backends
 ## Features
+- **37+ real models** in registry including Nemotron-3 Nano-Omni 30B and OpenSonnet-Lite-MAX
 - **4 hallucination detectors** (EPR, Spilled Energy, CK-PLUG, TWAVE)
 - **Novel composite signals**: EEP, PTI, NEWI
+- **Per-token thermodynamic telemetry** with risk scoring
+- **VRAM-aware model filtering** — only shows models that fit your budget
+## Quick Start
+1. Open the Space
+2. Enter a prompt
+3. Click "Generate with NEXUS OS"
+4. Get real inference + thermodynamic risk analysis
 ## Repository
 [specimba/nexus-os-v2](https://huggingface.co/datasets/specimba/nexus-os-v2)
+## Troubleshooting
+- **"HF Inference API unavailable"**: Your HF token may have exhausted its free credits. The Space will fall back to mock mode.
+- **"Ollama relay unreachable"**: Check that your ngrok tunnel is active and that the URL in Space secrets is correct.
+- **"Cloud API failed"**: Ensure API keys are added as Space secrets (not hardcoded).
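
The backend order described under "How It Works" (HF Inference API, then Ollama relay, then cloud APIs, then mock mode) could be sketched roughly as follows. This is a minimal illustration and not the Space's actual code: `BackendUnavailable`, `mock_backend`, and `generate_with_fallback` are hypothetical names, and each real backend is assumed to be wrapped as a plain function that raises `BackendUnavailable` when it cannot serve a request.

```python
from typing import Callable, List, Tuple


class BackendUnavailable(Exception):
    """Hypothetical error a backend wrapper raises when it cannot serve a request."""


def mock_backend(prompt: str) -> str:
    # Last resort: a simulated response, so the UI can still be exercised.
    return f"[mock] simulated response to: {prompt}"


def generate_with_fallback(
    prompt: str,
    backends: List[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str, List[str]]:
    """Try each (name, function) pair in order and return the first success.

    Returns (backend_name, text, errors), where errors records why earlier
    backends were skipped. Falls back to mock mode if every backend fails.
    """
    errors: List[str] = []
    for name, call in backends:
        try:
            return name, call(prompt), errors
        except BackendUnavailable as exc:
            errors.append(f"{name}: {exc}")
    return "mock", mock_backend(prompt), errors
```

A caller would pass the wrappers in priority order, for example `[("hf", call_hf), ("ollama", call_ollama), ("cloud", call_cloud)]`, where those three functions are assumed to exist elsewhere. The collected `errors` list can then be matched against the messages in the Troubleshooting section.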