MihaiPopa-1/CinnabarLM-4M-Base-Preview

Text Generation

MihaiPopa-1 committed 20 days ago

Commit 561b851 · verified · 1 parent: 9e2b21f

Update README.md

Files changed (1)

README.md +19 -2

@@ -15,10 +15,22 @@ tags:
 ---
 # CinnabarLM
-CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on Colab)!
 # Model Configurations
 # Training Configurations
 | Hyperparameter | Value |
@@ -33,4 +45,9 @@ CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on C
 # Limitations
 * **Not Instruction-Tuned:** It's only a base model, so it only completes text.
-* **English-Only:** It's trained on English data (FineWeb), it's NOT multilingual.

 ---
 # CinnabarLM
+CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on Colab)! It's only 16 MB in size!
+# Why?
+Because tiny LLMs are worth making. Others have already built some, such as [MicroLM](https://huggingface.co/CromIA/MicroLM-1M), [Spark 4 5M](https://huggingface.co/LH-Tech-AI/Spark-5M-Base-v4) and [Tenete 8M](https://huggingface.co/Harley-ml/Tenete-8M), but I hadn't made one myself!
 # Model Configurations
+| Parameter | Value |
+|---|---|
+| Tokenizer | Custom BPE tokenizer |
+| Vocabulary Size | 4096 tokens |
+| Batch Size | 64 |
+| Context Window | 256 tokens |
+| `n_embed` | 192 |
+| `n_head` | 8 |
+| `n_layer` | 6 |
+| Dropout | 0.1 |
 # Training Configurations
 | Hyperparameter | Value |
 # Limitations
 * **Not Instruction-Tuned:** It's only a base model, so it only completes text.
+* **English-Only:** It's trained on English data (FineWeb), so it's NOT multilingual.
+* **Not a Standard Model:** It's NOT a Qwen/Llama/GPT model, so the standard Transformers library can't recognize it!
+# Some other details
+* It's trained on 80 million tokens of [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) (CC-MAIN-2025-26 snapshot), and the knowledge cutoff is June 2025.
+* I picked the name "CinnabarLM" by combining "Cinnabar" (the new block from the Chaos Cubed drop in Minecraft) and "LM" (Language Model).
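
The figures in the Model Configurations table can be sanity-checked against the stated "4M-parameter" and "16 MB" claims with a little arithmetic. The sketch below assumes a GPT-2-style decoder (learned position embeddings, a 4x-wide MLP, bias terms, two LayerNorms per block), which the README does not confirm. `Config`, `estimate_params`, and `tokens_per_step` are hypothetical helper names, not part of the model's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    # Values copied from the Model Configurations table in the README.
    vocab_size: int = 4096
    context: int = 256
    n_embed: int = 192
    n_head: int = 8      # does not change the parameter count
    n_layer: int = 6
    batch_size: int = 64


def estimate_params(cfg: Config, tied_embeddings: bool = True) -> int:
    """Rough parameter count, ASSUMING a GPT-2-style architecture."""
    d = cfg.n_embed
    tok_emb = cfg.vocab_size * d            # token embedding table
    pos_emb = cfg.context * d               # learned position embeddings (assumed)
    attn = (3 * d * d + 3 * d) + (d * d + d)        # QKV projection + output projection
    mlp = (4 * d * d + 4 * d) + (4 * d * d + d)     # up-projection + down-projection
    norms = 2 * (2 * d)                     # two LayerNorms per block (weight + bias)
    block = attn + mlp + norms
    final_norm = 2 * d
    # An untied output head adds a second vocab_size x d matrix.
    lm_head = 0 if tied_embeddings else cfg.vocab_size * d
    return tok_emb + pos_emb + cfg.n_layer * block + final_norm + lm_head


def tokens_per_step(cfg: Config) -> int:
    """Tokens consumed by one optimizer step (no gradient accumulation assumed)."""
    return cfg.batch_size * cfg.context


if __name__ == "__main__":
    cfg = Config()
    for tied in (True, False):
        n = estimate_params(cfg, tied_embeddings=tied)
        size_mb = n * 4 / 1e6  # 4 bytes per parameter in float32
        print(f"tied={tied}: {n:,} params, about {size_mb:.1f} MB in float32")
    print(f"tokens per step: {tokens_per_step(cfg):,}")
```

Under these assumptions the estimate comes to roughly 3.5M parameters with tied embeddings and 4.3M without, or about 14 MB and 17 MB in float32. Both are in the neighbourhood of the stated 4M parameters and 16 MB, but the real architecture may differ.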