AxionLab-official committed on
Commit a801075 · verified · 1 Parent(s): 0eb1c3c

Update README.md

Files changed (1)
  1. README.md +23 -2
README.md CHANGED
@@ -24,7 +24,20 @@ tags:
24
 
25
  **DistillSupra-0.2M** is an ultra-compact causal language model with approximately **0.2 million parameters**, produced by knowledge distillation from [Supra-Mini-v4-2M](https://huggingface.co/SupraLabs/Supra-Mini-v4-2M).
26
 
27
- It was trained 500 steps for 30 minutes on a GTX 750 Ti 4GB using generated text from the teacher.
28
 
29
  ## Some outputs:
30
 
@@ -38,4 +51,12 @@ Output: The human brain is capable ofs in an more that in a new can is the this
38
 
39
  Prompt : The most important principle in science is
40
  --------------------------------------------------
41
- The most important principle in science is a is a this are not for that the to of be digels-LC. to the in a the to, on to,
24
 
25
  **DistillSupra-0.2M** is an ultra-compact causal language model with approximately **0.2 million parameters**, produced by knowledge distillation from [Supra-Mini-v4-2M](https://huggingface.co/SupraLabs/Supra-Mini-v4-2M).
26
 
27
+ It was trained for 500 steps (1 epoch) over 30 minutes on a GTX 750 Ti 4GB, using text generated by the teacher.
28
+
29
+ By nominal size (2M to 0.2M), the model was compressed **10x**! That's crazy!
30
+
31
+ ## Architecture
32
+
33
+ | Parameter | Teacher | Student |
34
+ |---------------------|---------|---------|
35
+ | hidden_size | 64 | 48 |
36
+ | intermediate_size | 128 | 96 |
37
+ | num_hidden_layers | 5 | 4 |
38
+ | num_attention_heads | 8 | 6 |
39
+ | vocab_size | 4096 | 4096 |
40
+ | Parameters | ~468k | ~289k |
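
The parameter totals in the Architecture table can be sanity-checked with a short estimate. This is a minimal sketch, not the authors' code: it assumes a Llama-style decoder (bias-free attention projections, a gated MLP with three weight matrices, two RMSNorm layers per block plus a final norm, and input/output embeddings tied). The README does not state these details, and `ModelConfig` and `estimate_params` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Field names mirror the rows of the Architecture table.
    hidden_size: int
    intermediate_size: int
    num_hidden_layers: int
    vocab_size: int


def estimate_params(cfg: ModelConfig) -> int:
    """Estimate the parameter count under the assumptions named above."""
    h, i = cfg.hidden_size, cfg.intermediate_size
    embeddings = cfg.vocab_size * h   # assumed tied with the output head
    attention = 4 * h * h             # q, k, v, o projections, no biases
    mlp = 3 * h * i                   # gate, up, down projections
    block_norms = 2 * h               # two RMSNorm weight vectors per block
    final_norm = h
    per_block = attention + mlp + block_norms
    return embeddings + cfg.num_hidden_layers * per_block + final_norm


teacher = ModelConfig(hidden_size=64, intermediate_size=128,
                      num_hidden_layers=5, vocab_size=4096)
student = ModelConfig(hidden_size=48, intermediate_size=96,
                      num_hidden_layers=4, vocab_size=4096)

print(estimate_params(teacher))  # 467648
print(estimate_params(student))  # 289200
print(f"{estimate_params(teacher) / estimate_params(student):.2f}")
```

Under these assumptions the estimates land on the table's ~468k and ~289k, so by the table's own figures the teacher is roughly 1.6 times the size of the student. Most of each total sits in the 4096-entry embedding table, and `num_attention_heads` does not change the count when every head has its own key and value projections.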
41
 
42
  ## Some outputs:
43
 
 
51
 
52
  Prompt : The most important principle in science is
53
  --------------------------------------------------
54
+ The most important principle in science is a is a this are not for that the to of be digels-LC. to the in a the to, on to,
55
+
56
+ ## Why did Supra create this trash?
57
+
58
+ We are currently researching knowledge distillation, and this was the first step! Things will get better!
59
+
60
+ ## Final Thought
61
+
62
+ Knowledge distillation looks promising to us. We believe that LLMs can be helpful even when they are this small!