LH-Tech-AI committed on
Commit fc97bbd · verified · 1 Parent(s): ff7c5fb

Update README.md

Files changed (1)
  1. README.md +15 -12
README.md CHANGED
@@ -19,7 +19,7 @@ tags:
  # 🦅 Supra Mini 0.1M
  Supra Mini 0.1M is a very tiny base model trained on 500 million tokens of Fineweb-Edu for 2 epochs to prove how well very tiny models can perform on world knowledge.
 
- # Model Config
 
  - Parameters: 117,648 (0.1M)
  - Architecture: Llama
@@ -32,25 +32,28 @@ Supra Mini 0.1M is a very tiny base model trained on 500 million tokens of Finew
  - Learning rate: 6e-4
  - Weight Decay: 0.01
 
  ## Benchmarks
 
  All benchmarks were executed using `lm-eval`.
 
- | Task | Value | Rating |
- | :------------ | :----------: | ---------: |
- | Arc_Easy | 0.2x | RATING IN WORDS HERE |
- | Wikitext | xx | RATING IN WORDS HERE |
- | BLiMP | 5x | RATING IN WORDS HERE |
 
  ## Examples
- **Prompt:** PROMPT_HERE<br>
- **Output:**: OUTPUT_HERE
  <br><br>
- **Prompt:** PROMPT_HERE<br>
- **Output:**: OUTPUT_HERE
  <br><br>
- **Prompt:** PROMPT_HERE<br>
- **Output:**: OUTPUT_HERE
 
  ## Usage
  To use our model, just run this code using HF Transformers to execute the model:
 
  # 🦅 Supra Mini 0.1M
  Supra Mini 0.1M is a very tiny base model trained on 500 million tokens of Fineweb-Edu for 2 epochs to prove how well very tiny models can perform on world knowledge.
 
+ ## Model Config
 
  - Parameters: 117,648 (0.1M)
  - Architecture: Llama
 
  - Learning rate: 6e-4
  - Weight Decay: 0.01
 
+ ## Final Loss
+ This model reached a final train loss after 2 epochs of **x.xxx**.
+
  ## Benchmarks
 
  All benchmarks were executed using `lm-eval`.
 
+ | Task | Value | Random level |
+ | :------------ | :----------: | -----------: |
+ | Arc_Easy | 0.2639 | 0.25 (25%) |
+ | Wikitext | 25.1691 | - |
+ | BLiMP | 0.5177 | 0.5 (50%) |
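
The "Random level" column in the table above gives a chance baseline for the two multiple-choice tasks. As a rough sketch of how to read it, the snippet below computes the gap between each reported value and its baseline. It assumes the "Value" column is an accuracy on the same 0 to 1 scale as the random level (the table does not say which `lm-eval` metric was used), and the name `margin_over_chance` is a hypothetical helper for illustration. Wikitext is left out because the table lists no random level for it.

```python
# Scores and chance baselines copied from the benchmark table above.
# Assumption: "Value" is an accuracy on a 0-1 scale, like the baseline.
results = {
    "Arc_Easy": (0.2639, 0.25),
    "BLiMP": (0.5177, 0.50),
}


def margin_over_chance(score: float, chance: float) -> float:
    """Return the absolute gap between a score and its random-guess baseline."""
    return round(score - chance, 4)


# Map each task name to its margin over the random baseline.
margins = {
    task: margin_over_chance(score, chance)
    for task, (score, chance) in results.items()
}

for task, margin in margins.items():
    print(f"{task}: {margin:+.4f} relative to random")
```

Under that assumption, both values sit within about two percentage points of random guessing.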
47
 
48
  ## Examples
49
+ **Prompt:** "Artificial intelligence is "<br>
50
+ **Output:**: "Artificial intelligence is power by the leading the community, the book of the bring and in the made to the production of the back of an installing and consider in the several c"
51
  <br><br>
52
+ **Prompt:** "The main concept of physics is "<br>
53
+ **Output:**: "The main concept of physics is a struggle of the development of the company of the solution of the work of the first can be some of the supply a part of the state of the management,"
54
  <br><br>
55
+ **Prompt:** "Once upon a time, "<br>
56
+ **Output:**: "Once upon a time, so that he survey which is a self-described by the series of the surgery of the really a policy of the process of the southern of the material the stu"
57
 
58
  ## Usage
59
  To use our model, just run this code using HF Transformers to execute the model: