---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb
language:
- en
pipeline_tag: text-generation
tags:
- tiny-model
- cinnabarlm
- tiny-llm
- tiny-lm
- tinylm
- tinyllm
new_version: MihaiPopa-1/CinnabarLM-4M-Base
---
# CinnabarLM
CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on Colab)! It's only 16 MB in size!
# Why?
Because it's a good idea to make tiny LLMs. Some people have already done so with [MicroLM](https://huggingface.co/CromIA/MicroLM-1M), [Spark 4 5M](https://huggingface.co/LH-Tech-AI/Spark-5M-Base-v4) and [Tenete 8M](https://huggingface.co/Harley-ml/Tenete-8M), but I hadn't yet!
# Model Configurations
| Parameter | Value |
|---|---|
| Tokenizer | Custom BPE tokenizer |
| Vocabulary Size | 4096 tokens |
| Batch Size | 64 |
| Context Window | 256 tokens |
| `n_embed` | 192 |
| `n_head` | 8 |
| `n_layer` | 6 |
| Dropout | 0.1 |
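
As a sanity check on the "4M-parameter" and "16 MB" figures, here is a rough sketch that counts weights from the table above. It assumes a GPT-2-style decoder (learned position embeddings, a 4x-wide MLP, biases ignored), which this card does not confirm. The names `estimate_params`, `block_size` and `tie_embeddings` are made up for this illustration.

```python
def estimate_params(vocab_size, block_size, n_embed, n_layer, tie_embeddings):
    """Rough weight count for an ASSUMED GPT-2-style decoder (biases ignored)."""
    tok_emb = vocab_size * n_embed   # token embedding table
    pos_emb = block_size * n_embed   # learned position embeddings (assumption)
    attn = 4 * n_embed * n_embed     # Q, K, V and output projections
    mlp = 8 * n_embed * n_embed      # two linear layers with a 4x hidden width
    norms = 2 * n_embed              # two LayerNorm weight vectors per block
    per_block = attn + mlp + norms
    final_norm = n_embed
    # If the output head shares weights with the token embedding, it adds nothing.
    lm_head = 0 if tie_embeddings else vocab_size * n_embed
    return tok_emb + pos_emb + n_layer * per_block + final_norm + lm_head


# Values taken from the Model Configurations table.
config = dict(vocab_size=4096, block_size=256, n_embed=192, n_layer=6)
tied = estimate_params(**config, tie_embeddings=True)
untied = estimate_params(**config, tie_embeddings=False)

for name, n in (("tied", tied), ("untied", untied)):
    # float32 stores each parameter in 4 bytes
    print(f"{name}: {n:,} params, {n * 4 / 2**20:.1f} MiB in float32")

# Each attention head works on n_embed / n_head dimensions.
head_size = 192 // 8
print("head size:", head_size)
```

Under these assumptions, the untied variant comes out at roughly 4.3M parameters, which is in line with the numbers quoted at the top of this card.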
# Training Configurations
| Hyperparameter | Value |
|---|---|
| `max_iters` | 10000 |
| `eval_interval` | 500 |
| `learning_rate` | 6e-4 |
| `min_lr` | 6e-5 |
| `warmup_iters` | 500 |
| `weight_decay` | 0.1 |
| `beta1, beta2` | 0.9, 0.95 |
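
To show how `learning_rate`, `min_lr`, `warmup_iters` and `max_iters` could fit together, here is a minimal sketch of a linear warmup followed by cosine decay. The card does not state which schedule was used, so the cosine shape is an assumption, and the `get_lr` helper is a name made up for this illustration.

```python
import math

# Values taken from the Training Configurations table.
learning_rate = 6e-4
min_lr = 6e-5
warmup_iters = 500
max_iters = 10000


def get_lr(it):
    """Learning rate at iteration `it` (ASSUMED warmup + cosine decay)."""
    if it < warmup_iters:
        # Linear warmup: reaches learning_rate on the last warmup step.
        return learning_rate * (it + 1) / warmup_iters
    if it >= max_iters:
        return min_lr
    # Cosine decay from learning_rate down to min_lr.
    progress = (it - warmup_iters) / (max_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + coeff * (learning_rate - min_lr)


for it in (0, 250, 500, 5250, 10000):
    print(it, get_lr(it))
```

The `weight_decay` and `beta1, beta2` values would be passed to the optimizer itself and are not part of this sketch.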
# Limitations
* **Not Instruction-Tuned:** It's only a base model, so it only completes text.
* **English-Only:** It's trained on English data (FineWeb), so it's NOT multilingual.
* **Not a Standard Model:** It's NOT a Qwen/Llama/GPT model. The standard Hugging Face Transformers library can't recognize this!
* **Preview:** This is a preview version, and it often generates gibberish. CinnabarLM 1 will solve this with Llama.
# Some other details
* It's trained on 80 million tokens of [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) (CC-MAIN-2025-26 snapshot), and the knowledge cutoff is June 2025.
* The name "CinnabarLM" combines "Cinnabar" (the new block from the Chaos Cubed drop in Minecraft) and "LM" (Language Model).
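
As a rough cross-check of the 80-million-token figure against the batch size, context window and `max_iters` listed earlier, the arithmetic below estimates how many tokens the training loop processed. It assumes every iteration consumes exactly one batch of 64 full-length sequences with no gradient accumulation, which this card does not state.

```python
# Values taken from the tables and details in this card.
batch_size = 64
context_window = 256
max_iters = 10_000
dataset_tokens = 80_000_000

# ASSUMPTION: one batch per iteration, no gradient accumulation.
tokens_per_iter = batch_size * context_window
tokens_processed = tokens_per_iter * max_iters
passes = tokens_processed / dataset_tokens

print(f"tokens per iteration: {tokens_per_iter:,}")
print(f"tokens processed: {tokens_processed:,}")
print(f"approximate passes over the data: {passes:.2f}")
```

If those assumptions hold, the run would have seen the data about twice.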