Harley-ml committed on
Commit bf1bea9 · verified · 1 Parent(s): cc38ce4

Update README.md

Files changed (1)
  1. README.md +13 -2
README.md CHANGED
@@ -22,7 +22,7 @@ new_version: SupraLabs/Supra-Mini-v5-8M
22
  We apologize for the inconvenience. The correct model should be uploaded soon.
23
 
24
  # 🦅 Supra Mini v4 2M
25
- Supra Mini **v4** 2M is a very tiny base model trained on **3 billion** tokens of Fineweb-Edu for 2 epochs as the **fourth version** of our Supra Mini series.
26
 
27
  ## Model Config
28
 
@@ -39,7 +39,7 @@ Supra Mini **v4** 2M is a very tiny base model trained on **3 billion** tokens o
39
  - Trained in bfloat16
40
 
41
  ## Final Loss
42
- This model reached a final train loss after 2 epochs of **4.618**.
43
 
44
  ## Benchmarks
45
 
@@ -100,6 +100,17 @@ print(f"\nPrompt: {test_prompt}")
100
  print("-" * 30)
101
  print("\nOutput:\n" + generate_text(test_prompt))
102
  ```
103
 
104
  ## Training guide
105
  We trained Supra Mini v4 2M on a single NVIDIA RTX 5060 Ti 16GB in ~3 hours for 2 epochs.<br>
 
22
  We apologize for the inconvenience. The correct model should be uploaded soon.
23
 
24
  # 🦅 Supra Mini v4 2M
25
+ Supra Mini **v4** 2M is a very small model trained on **3 billion** tokens of Fineweb-Edu for 2 epochs as the **fourth version** of our Supra Mini series.
26
 
27
  ## Model Config
28
 
 
39
  - Trained in bfloat16
40
 
41
  ## Final Loss
42
+ This model reached a final cross-entropy loss (on the train set) of **4.618**.
43
 
44
  ## Benchmarks
45
 
 
100
  print("-" * 30)
101
  print("\nOutput:\n" + generate_text(test_prompt))
102
  ```
103
+ ## Use cases
104
+
105
+ 1. Educational research
106
+ 2. Deployment or testing/fine-tuning on edge environments
107
+ 3. Or more simply, for fun
108
+
109
+ ## Limitations
110
+
111
+ 1. Cannot reason, chat, or code
112
+ 2. Incoherent more often than not
113
+ 3. Mostly unfactual
114
 
115
  ## Training guide
116
  We trained Supra Mini v4 2M on a single NVIDIA RTX 5060 Ti 16GB in ~3 hours for 2 epochs.<br>
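
A note on the Final Loss figure changed in this commit: a cross-entropy loss is often easier to interpret as a perplexity. The sketch below is illustrative only. It assumes the reported **4.618** is a mean per-token loss in nats (the PyTorch `CrossEntropyLoss` default), which the README does not state, and the helper names are hypothetical.

```python
import math

# Train-set cross-entropy reported in the README's "Final Loss" section.
# Assumption: mean per-token loss measured in nats (natural log).
final_train_loss = 4.618


def perplexity(cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy in nats to perplexity."""
    return math.exp(cross_entropy_nats)


def bits_per_token(cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy from nats to bits."""
    return cross_entropy_nats / math.log(2)


if __name__ == "__main__":
    print(f"perplexity     ~ {perplexity(final_train_loss):.1f}")
    print(f"bits per token ~ {bits_per_token(final_train_loss):.2f}")
```

Under that assumption the train perplexity comes out at roughly 101, meaning the model is about as uncertain as a uniform choice among ~101 tokens at each step. That is consistent with the "Incoherent more often than not" limitation listed above.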