MihaiPopa-1 committed · Commit 561b851 · verified · 1 Parent(s): 9e2b21f

Update README.md

Files changed (1)
  1. README.md +19 -2
README.md CHANGED
@@ -15,10 +15,22 @@ tags:
15
  ---
16
 
17
  # CinnabarLM
 
18
 
19
- CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on Colab)!
 
20
 
21
  # Model Configurations
 
22
 
23
  # Training Configurations
24
  | Hyperparameter | Value |
@@ -33,4 +45,9 @@ CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on C
33
 
34
  # Limitations
35
  * **Not Instruction-Tuned:** It's only a base model, so it only completes text.
36
- * **English-Only:** It's trained on English data (FineWeb), it's NOT multilingual.
 
15
  ---
16
 
17
  # CinnabarLM
18
+ CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on Colab)! It's only 16 MB in size!
19
 
20
+ # Why?
21
+ Because it's a good idea to make tiny LLMs. Others have already done it with [MicroLM](https://huggingface.co/CromIA/MicroLM-1M), [Spark 4 5M](https://huggingface.co/LH-Tech-AI/Spark-5M-Base-v4) and [Tenete 8M](https://huggingface.co/Harley-ml/Tenete-8M), but I hadn't yet!
22
 
23
  # Model Configurations
24
+ | Parameter | Value |
25
+ |---|---|
26
+ | Tokenizer | Custom BPE tokenizer |
27
+ | Vocabulary Size | 4096 tokens |
28
+ | Batch Size | 64 |
29
+ | Context Window | 256 tokens |
30
+ | `n_embed` | 192 |
31
+ | `n_head` | 8 |
32
+ | `n_layer` | 6 |
33
+ | Dropout | 0.1 |
34
 
35
  # Training Configurations
36
  | Hyperparameter | Value |
 
45
 
46
  # Limitations
47
  * **Not Instruction-Tuned:** It's only a base model, so it only completes text.
48
+ * **English-Only:** It's trained on English data (FineWeb), so it's NOT multilingual.
49
+ * **Not a Standard Model:** It's NOT a Qwen/Llama/GPT model, so the standard Transformers library can't recognize it!
50
+
51
+ # Some Other Details
52
+ * It's trained on 80 million tokens of [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) (CC-MAIN-2025-26 snapshot), and the knowledge cutoff is June 2025.
53
+ * The name "CinnabarLM" combines "Cinnabar" (the new block from the Chaos Cubed drop in Minecraft) and "LM" (Language Model).
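
As a rough sanity check on the "4M-parameter" and "16 MB" figures, the values in the Model Configurations table can be plugged into a parameter-count estimate. This is only a sketch: the README does not describe the architecture, so the code assumes a GPT-2-style decoder (learned positional embeddings, 4x MLP expansion, biases on all linear layers, fp32 weights). The names `CinnabarConfig` and `estimate_params` are hypothetical and not part of the model's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CinnabarConfig:
    # Values copied from the Model Configurations table.
    vocab_size: int = 4096
    block_size: int = 256  # context window, in tokens
    n_embed: int = 192
    n_head: int = 8
    n_layer: int = 6
    dropout: float = 0.1


def estimate_params(cfg: CinnabarConfig, tied_embeddings: bool = False) -> int:
    """Count parameters for an ASSUMED GPT-2-style decoder (not confirmed by the README)."""
    d = cfg.n_embed
    tok_emb = cfg.vocab_size * d          # token embedding table
    pos_emb = cfg.block_size * d          # learned positional embeddings (assumption)
    attn = (d * 3 * d + 3 * d) + (d * d + d)      # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)   # 4x expansion MLP (assumption)
    norms = 2 * (2 * d)                   # two LayerNorms per block, weight + bias each
    per_layer = attn + mlp + norms
    final_norm = 2 * d
    # An untied output head costs another vocab_size * d weights.
    lm_head = 0 if tied_embeddings else cfg.vocab_size * d
    return tok_emb + pos_emb + cfg.n_layer * per_layer + final_norm + lm_head


cfg = CinnabarConfig()
assert cfg.n_embed % cfg.n_head == 0  # each head gets n_embed / n_head = 24 dimensions

untied = estimate_params(cfg)
tied = estimate_params(cfg, tied_embeddings=True)
print(f"untied head: {untied:,} params, ~{untied * 4 / 1e6:.1f} MB in fp32")
print(f"tied head:   {tied:,} params, ~{tied * 4 / 1e6:.1f} MB in fp32")
```

Under these assumptions the estimate comes to 4,291,584 parameters with an untied output head (3,505,152 with tied embeddings), or roughly 17.2 MB in fp32. That is in the same range as the stated 4M parameters and 16 MB, though the real layout may differ.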