LH-Tech-AI committed on
Commit
e3d50ed
·
verified ·
1 Parent(s): 03cfb64

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -21,7 +21,7 @@ tags:

  # 🦅 Supra-50M

- **Supra-50M** is a compact 50M-parameter causal language model built by SupraLabs, trained from scratch using a Llama-style architecture on 20 billion tokens of high-quality educational web text. Despite being significantly smaller than comparable open models, it achieves competitive or superior results on several key benchmarks. It is our first Supra Scalling Up plan model.
+ **Supra-50M** is a compact 50M-parameter causal language model built by SupraLabs, trained from scratch using a Llama-style architecture on 20 billion tokens of high-quality educational web text. Despite being significantly smaller than comparable open models, it achieves competitive or superior results on several key benchmarks.

  ---

@@ -34,13 +34,13 @@ Supra-50M outperforms much larger models — GPT-2 Small (124M), SmolLM-135M, an
  ### Benchmark Table

  | Benchmark | Supra-50M *(ours)* | GPT-2 (124M) | SmolLM-135M | OpenELM-270M |
- |---|---|---|---|---|
- | **Parameters** | **50M** | 124M *(2.5× larger)* | 135M *(2.7× larger)* | 270M *(5.4× larger)* |
- | BLiMP (linguistics) | **76.3%** | ~63.0% | ~75.2% | ~68.0% |
- | SciQ (science) | **77.2%** | ~52.0% | ~74.5% | ~61.0% |
- | ARC-Easy (knowledge) | 52.2% | ~42.0% | **~55.0%** | ~46.0% |
- | PIQA (logic) | 62.2% | ~61.0% | **~63.3%** | ~60.5% |
- | HellaSwag (context) | 31.8% | ~31.0% | **~34.0%** | ~28.0% |
+ | :--- | :--- | :--- | :--- | :--- |
+ | **Parameters** | **50M** | 124M *(2.5×)* | 135M *(2.7×)* | 270M *(5.4×)* |
+ | **BLiMP** (linguistics) | **76.3%** | 63.0% | **69.8%** | *(k.A.)* |
+ | **SciQ** (science) | 77.2% | 53.2% | 73.4% | **84.70%** |
+ | **ARC-Easy** (knowledge) | 52.2% | 42.0% | 49.2% | **45.08%** |
+ | **PIQA** (logic) | 62.2% | 63.0% | 67.3% | **69.75%** |
+ | **HellaSwag** (context) | 31.8% | 29.5% | 42.0% | **46.71%** |

  ---

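
The size multipliers in the **Parameters** row (2.5×, 2.7×, 5.4×) follow from dividing each model's parameter count by Supra-50M's 50M. A minimal sketch of that arithmetic is below; the helper name `size_ratios` and the dictionary layout are illustrative assumptions, not part of the README or the commit.

```python
# Parameter counts in millions, copied from the Parameters row of the table.
PARAMS_M = {
    "Supra-50M": 50,
    "GPT-2": 124,
    "SmolLM-135M": 135,
    "OpenELM-270M": 270,
}


def size_ratios(params, baseline="Supra-50M"):
    """Return how many times larger each model is than the baseline.

    Hypothetical helper for illustration; ratios are rounded to one
    decimal place, matching the precision used in the table.
    """
    base = params[baseline]
    return {
        name: round(count / base, 1)
        for name, count in params.items()
        if name != baseline
    }


ratios = size_ratios(PARAMS_M)
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the size of Supra-50M")
```

Note that 124 / 50 is 2.48, so the 2.5× figure for GPT-2 is a rounded value.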