Instructions to use SupraLabs/Supra-Mini-0.1M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use SupraLabs/Supra-Mini-0.1M with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="SupraLabs/Supra-Mini-0.1M")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SupraLabs/Supra-Mini-0.1M")
model = AutoModelForCausalLM.from_pretrained("SupraLabs/Supra-Mini-0.1M")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use SupraLabs/Supra-Mini-0.1M with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "SupraLabs/Supra-Mini-0.1M"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SupraLabs/Supra-Mini-0.1M",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```bash
docker model run hf.co/SupraLabs/Supra-Mini-0.1M
```
- SGLang
How to use SupraLabs/Supra-Mini-0.1M with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "SupraLabs/Supra-Mini-0.1M" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SupraLabs/Supra-Mini-0.1M",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "SupraLabs/Supra-Mini-0.1M" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SupraLabs/Supra-Mini-0.1M",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use SupraLabs/Supra-Mini-0.1M with Docker Model Runner:
```bash
docker model run hf.co/SupraLabs/Supra-Mini-0.1M
```
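The curl calls to the vLLM and SGLang servers can also be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, and the `choices[0]["text"]` response shape is an assumption based on the OpenAI completions format that both servers advertise compatibility with.

```python
import json
import urllib.request

MODEL_ID = "SupraLabs/Supra-Mini-0.1M"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint.

    The default base_url matches the vLLM example (port 8000); pass
    "http://localhost:30000" for the SGLang example.
    """
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Requires one of the servers above to be running. The response layout
    (choices[0]["text"]) is assumed, not taken from the model card.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` targets the vLLM port, and `complete("Once upon a time,", base_url="http://localhost:30000")` targets SGLang. Building the request separately from sending it keeps the payload easy to inspect without a live server.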
Update README.md
README.md
CHANGED
```diff
@@ -19,7 +19,7 @@ tags:
 # 🦅 Supra Mini 0.1M
 Supra Mini 0.1M is a very tiny base model trained on 500 million tokens of Fineweb-Edu for 2 epochs to prove how well very tiny models can perform on world knowledge.
 
-# Model Config
+## Model Config
 
 - Parameters: 117,648 (0.1M)
 - Architecture: Llama
@@ -32,25 +32,28 @@ Supra Mini 0.1M is a very tiny base model trained on 500 million tokens of Finew
 - Learning rate: 6e-4
 - Weight Decay: 0.01
 
+## Final Loss
+This model reached a final train loss after 2 epochs of **x.xxx**.
+
 ## Benchmarks
 
 All benchmarks were executed using `lm-eval`.
 
-| Task | Value |
-| :------------ | :----------: | ---------: |
-| Arc_Easy | 0.
-| Wikitext |
-| BLiMP |
+| Task | Value | Random level |
+| :------------ | :----------: | -----------: |
+| Arc_Easy | 0.2639 | 0.25 (25%) |
+| Wikitext | 25.1691 | - |
+| BLiMP | 0.5177 | 0.5 (50%) |
 
 ## Examples
-**Prompt:**
-**Output:**:
+**Prompt:** "Artificial intelligence is "<br>
+**Output:**: "Artificial intelligence is power by the leading the community, the book of the bring and in the made to the production of the back of an installing and consider in the several c"
 <br><br>
-**Prompt:**
-**Output:**:
+**Prompt:** "The main concept of physics is "<br>
+**Output:**: "The main concept of physics is a struggle of the development of the company of the solution of the work of the first can be some of the supply a part of the state of the management,"
 <br><br>
-**Prompt:**
-**Output:**:
+**Prompt:** "Once upon a time, "<br>
+**Output:**: "Once upon a time, so that he survey which is a self-described by the series of the surgery of the really a policy of the process of the southern of the material the stu"
 
 ## Usage
 To use our model, just run this code using HF Transformers to execute the model:
```
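To put the README's numbers in context, here is a small arithmetic sketch. It takes the parameter count, token count, epoch count, and benchmark scores from the README. It assumes that "500 million tokens ... for 2 epochs" means the same 500 million tokens were seen twice, which the README does not state explicitly. The function names are illustrative only.

```python
PARAMETERS = 117_648         # from "Model Config"
UNIQUE_TOKENS = 500_000_000  # Fineweb-Edu tokens in the training set
EPOCHS = 2

# (reported score, random level) from the Benchmarks table.
# Wikitext is omitted because it has no random baseline.
BENCHMARKS = {
    "Arc_Easy": (0.2639, 0.25),
    "BLiMP": (0.5177, 0.50),
}


def tokens_seen(unique_tokens, epochs):
    """Total tokens processed, assuming every epoch covers the full set."""
    return unique_tokens * epochs


def tokens_per_parameter(total_tokens, parameters):
    """Training tokens processed per model parameter."""
    return total_tokens / parameters


def margin_over_chance(score, chance):
    """Absolute gain over random guessing, in percentage points."""
    return (score - chance) * 100


total = tokens_seen(UNIQUE_TOKENS, EPOCHS)
print(f"Tokens seen: {total:,}")
print(f"Tokens per parameter: {tokens_per_parameter(total, PARAMETERS):,.0f}")
for task, (score, chance) in BENCHMARKS.items():
    print(f"{task}: {margin_over_chance(score, chance):+.2f} points over chance")
```

Under that assumption the model sees roughly 8,500 tokens per parameter, and the reported scores sit 1.39 (Arc_Easy) and 1.77 (BLiMP) percentage points above the random level.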