Harley-ml committed on
Commit 08a6bea · verified · 1 Parent(s): 71c55a8

Update README.md

Files changed (1)
  1. README.md +13 -1
README.md CHANGED
@@ -48,7 +48,7 @@ new_version: SupraLabs/Supra-Mini-v4-2M
  ---

  # 🦅 Supra Mini 0.1M
- Supra Mini 0.1M is a very tiny base model trained on 500 million tokens of Fineweb-Edu for 2 epochs to prove how well very tiny models can perform on world knowledge.
+ Supra Mini 0.1M is a very small, yes, very small base model trained on 500 million tokens of Fineweb-Edu for 2 epochs to prove how well very tiny models can perform on world knowledge.

  ## Model Config

@@ -120,6 +120,18 @@ print("-" * 30)
  print("\nOutput:\n" + generate_text(test_prompt))
  ```

+ ## Use cases
+
+ 1. Educational research
+ 2. Deployment or testing/fine-tuning on edge environments
+ 3. Or more simply, for fun
+
+ ## Limitations
+
+ 1. Cannot reason, chat, or code
+ 2. Incoherent more often than not
+ 3. Mostly unfactual
+
  ## Training guide
  We trained Supra Mini 0.1M on a single T4 GPU in ~45 minutes for 2 epochs.<br>
  The full training code can be found in this repo as `run.sh` (easily run the complete pipeline), `train_tokenizer.py` (train custom BPE tokenizer with vocab size of 250), `train.py` (train the model) and `inference.py` (test the model).<br>
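
The training guide mentions a custom BPE tokenizer with a vocab size of 250. As a rough illustration of what training such a tokenizer involves, here is a minimal pure-Python sketch. It is not the repo's `train_tokenizer.py`: the function name `train_bpe`, the character-level base vocabulary, and the whitespace pre-splitting are all assumptions made for this example. Only the target size of 250 comes from the README.

```python
from collections import Counter


def train_bpe(corpus, vocab_size):
    """Learn BPE merges on a list of strings until the vocab reaches vocab_size.

    Hypothetical helper for illustration; assumes a character-level base
    vocabulary and whitespace pre-splitting.
    """
    # Each word starts as a tuple of single characters, weighted by frequency.
    words = Counter(tuple(word) for text in corpus for word in text.split())
    vocab = sorted({ch for word in words for ch in word})
    merges = []
    while len(vocab) < vocab_size:
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for word, freq in words.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is a single symbol, nothing left to merge
        # Pick the most frequent pair; ties are broken by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merged = best[0] + best[1]
        merges.append(best)
        if merged not in vocab:
            vocab.append(merged)
        # Rewrite every word, replacing the best pair with the merged symbol.
        new_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return vocab, merges


# Toy usage; a real run would pass the training corpus and vocab_size=250.
vocab, merges = train_bpe(["low low lower"], vocab_size=6)
print(vocab, merges)
```

A target of 250 is below the 256 symbols of a byte-level base vocabulary, which is why this sketch starts from the characters seen in the corpus. Whether the actual script does the same is not stated in the README.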