--- title: SmolLM2 Customs ADI emoji: 🤖 colorFrom: indigo colorTo: blue sdk: docker pinned: true short_description: DEMO — Build your own free LLM service --- # SmolLM2 Customs — Build Your Own LLM Service > A showcase: how to build a free, private, OpenAI-compatible LLM service on HuggingFace Spaces and plug it into any hub or application — no GPU, no money, no drama. > [!IMPORTANT] > This project is under active development — always use the latest release from [Codey Lab](https://github.com/Codey-LAB/SmolLM2-customs) *(more stable builds land there first)*. > This repo ([DEV-STATUS](https://github.com/VolkanSah/SmolLM2-ADI)) is where the chaos happens. 🔬 A ⭐ on the repos would be cool 😙 --- ## What is this? A minimal but production-ready LLM service built on: - **SmolLM2-360M-Instruct** — 269MB, Apache 2.0, runs on 2 CPUs for free - **FastAPI** — OpenAI-compatible `/v1/chat/completions` endpoint - **ADI** (Anti-Dump Index) — filters low-quality requests before they hit the model - **HF Dataset** — logs every request for later analysis and finetuning The point is not the model — the point is the pattern. Fork it, swap SmolLM2 for any model you want, and you have your own private LLM API running for free. --- ## How it works ``` Request ↓ ADI Score (is this request worth answering?) ↓ REJECT → returns improvement suggestions, logs to dataset MEDIUM/HIGH → SmolLM2 answers, logs to dataset SmolLM2 fails → returns 503 → hub fallback chain kicks in ``` --- ## Endpoints ``` GET / → status GET /v1/health → health check POST /v1/chat/completions → OpenAI-compatible inference ``` --- ## Plug into any Hub (one config block) Works out of the box with [Multi-LLM-API-Gateway](https://github.com/VolkanSah/Multi-LLM-API-Gateway): Hub Screenshot for this [SmolLM2](SmolLM2.jpg) ```ini [LLM_PROVIDER.smollm] active = "true" base_url = "https://YOUR-USERNAME-smollm2-customs.hf.space/v1" env_key = "SMOLLM_API_KEY" default_model = "smollm2-360m" models = "smollm2-360m, YOUR-USERNAME/your-finetuned-model" fallback_to = "gemini" [LLM_PROVIDER.smollm_END] ``` Any OpenAI-compatible client works the same way. --- ## Secrets (HF Space Settings) | Secret | Required | Description | |--------|----------|-------------| | `SMOLLM_API_KEY` | recommended | Locks the endpoint — set same value in your hub | | `HF_TOKEN` or `TEST_TOKEN` | optional | HF auth for dataset + model repo access | | `MODEL_REPO` | optional | Base model override (default: `HuggingFaceTB/SmolLM2-360M-Instruct`) | | `DATASET_REPO` | optional | Your private HF dataset for logging | | `PRIVATE_MODEL_REPO` | optional | Your private model repo for finetuned weights | **Auth modes:** ``` SMOLLM_API_KEY not set → open access (demo/showcase mode) SMOLLM_API_KEY set → protected (production mode) Space private → double protection (HF gate + your key) ``` --- ## ADI Routing | Decision | Action | |----------|--------| | `HIGH_PRIORITY` | SmolLM2 handles it | | `MEDIUM_PRIORITY` | SmolLM2 handles it | | `REJECT` | Returns suggestions, logs to dataset | | SmolLM2 fails | 503 → hub fallback chain | --- ## Training Utilities Every request is logged to your private HF dataset. Use it to improve over time: ```bash python train.py --mode export # export dataset → JSONL python train.py --mode validate # validate ADI weights against labeled data python train.py --mode finetune # finetune SmolLM2 on your data (coming soon) ``` Once you have enough data → finetune → push to your private model repo → Space loads it automatically next restart. --- ## Stack | Component | What it does | |-----------|-------------| | `main.py` | FastAPI, auth, routing | | `smollm.py` | Inference engine, lazy loading | | `model.py` | HF token resolution, dataset + model repo access | | `adi.py` | Request quality scoring | | `train.py` | Dataset export, ADI validation, finetuning | --- ## Part of - [Multi-LLM-API-Gateway](https://github.com/VolkanSah/Multi-LLM-API-Gateway) — the hub this was built for - [Anti-Dump-Index](https://github.com/VolkanSah/Anti-Dump-Index) — the ADI algorithm idea ## License Dual-licensed: - [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) - [Ethical Security Operations License v1.1 (ESOL)](ESOL) — mandatory, non-severable By using this software you agree to all ethical constraints defined in ESOL v1.1.