MihaiPopa-1/CinnabarLM-4M-Base-Preview

	---
	license: apache-2.0
	datasets:
	- HuggingFaceFW/fineweb
	language:
	- en
	pipeline_tag: text-generation
	tags:
	- tiny-model
	- cinnabarlm
	- tiny-llm
	- tiny-lm
	- tinylm
	- tinyllm
	new_version: MihaiPopa-1/CinnabarLM-4M-Base
	---

	# CinnabarLM
	CinnabarLM is a tiny, 4M-parameter LLM trained for ~33 minutes on a T4 GPU (on Colab)! It's only 16 MB in size!

	# Why?
	Because it's a good idea to make tiny LLMs. Others have already done it with [MicroLM](https://huggingface.co/CromIA/MicroLM-1M), [Spark 4 5M](https://huggingface.co/LH-Tech-AI/Spark-5M-Base-v4) and [Tenete 8M](https://huggingface.co/Harley-ml/Tenete-8M), but I hadn't yet!

	# Model Configurations
	| Parameter | Value |
	|---|---|
	| Tokenizer | Custom BPE tokenizer |
	| Vocabulary Size | 4096 tokens |
	| Batch Size | 64 |
	| Context Window | 256 tokens |
	| `n_embed` | 192 |
	| `n_head` | 8 |
	| `n_layer` | 6 |
	| Dropout | 0.1 |
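
As a rough sanity check on the "4M" figure, the sketch below estimates the parameter count from the table above. It assumes a GPT-2-style decoder (learned token and position embeddings, a 4x-wide MLP, biases on the linear layers, LayerNorm with scale and shift). The card does not state the exact architecture, so the helper name `estimate_params` and the tied/untied output head option are illustrative assumptions, not the author's code.

```python
# Rough parameter estimate, ASSUMING a GPT-2-style decoder block.
# The model card does not specify the architecture; this is illustrative only.

def estimate_params(vocab_size, block_size, n_embed, n_layer, tied_head=True):
    d = n_embed
    attn = (d * 3 * d + 3 * d) + (d * d + d)      # QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)   # MLP up-projection + down-projection
    norms = 2 * (2 * d)                           # two LayerNorms (scale + shift) per block
    per_block = attn + mlp + norms

    embeddings = vocab_size * d + block_size * d  # token + learned position embeddings
    final_norm = 2 * d                            # LayerNorm before the output head
    head = 0 if tied_head else vocab_size * d     # untied output head, no bias (assumed)

    return n_layer * per_block + embeddings + final_norm + head


# Values taken from the configuration table.
tied = estimate_params(4096, 256, 192, 6, tied_head=True)
untied = estimate_params(4096, 256, 192, 6, tied_head=False)
head_dim = 192 // 8  # n_embed / n_head

print(f"tied head:   {tied:,} parameters")
print(f"untied head: {untied:,} parameters")
print(f"per-head dimension: {head_dim}")
print(f"float32 size (untied): {untied * 4 / 1e6:.1f} MB")
```

Under these assumptions the estimate falls between roughly 3.5M (tied head) and 4.3M (untied head) parameters, which is in line with the "4M" label. At 4 bytes per float32 weight, the untied figure comes to about 17 MB (16.4 MiB).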

	# Training Configurations
	| Hyperparameter | Value |
	|---|---|
	| `max_iters` | 10000 |
	| `eval_interval` | 500 |
	| `learning_rate` | 6e-4 |
	| `min_lr` | 6e-5 |
	| `warmup_iters` | 500 |
	| `weight_decay` | 0.1 |
	| `beta1, beta2` | 0.9, 0.95 |
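
The `warmup_iters`, `learning_rate`, and `min_lr` values suggest a warmup-then-decay schedule. The card does not say which decay shape was used, so the sketch below assumes a common choice: linear warmup followed by cosine decay down to `min_lr` at `max_iters`. The function name `get_lr` is hypothetical.

```python
import math

# Values from the training configuration table.
learning_rate = 6e-4
min_lr = 6e-5
warmup_iters = 500
max_iters = 10000


def get_lr(it):
    """Learning rate at iteration `it` (ASSUMED linear warmup + cosine decay)."""
    if it < warmup_iters:
        # Linear ramp from 0 up to the peak learning rate.
        return learning_rate * it / warmup_iters
    if it >= max_iters:
        # After the schedule ends, stay at the floor.
        return min_lr
    # Fraction of the decay phase completed, in [0, 1).
    progress = (it - warmup_iters) / (max_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 down to 0
    return min_lr + coeff * (learning_rate - min_lr)


for it in (0, 250, 500, 5250, 10000):
    print(f"iter {it:>5}: lr = {get_lr(it):.2e}")
```

With this shape the rate peaks at 6e-4 right after warmup and ends at 6e-5, a 10x reduction over the run.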

	# Limitations
	* Not Instruction-Tuned: It's only a base model, so it only completes text.
	* English-Only: It's trained on English data (FineWeb), so it's NOT multilingual.
	* Not a Standard Model: It's NOT a Qwen/Llama/GPT model, so the standard Transformers library can't recognize it!
	* Preview: This is a preview version and it often generates gibberish. CinnabarLM 1 will solve this by switching to Llama.

	# Some other details
	* It's trained on 80 million tokens of [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) (CC-MAIN-2025-26 snapshot), so the knowledge cutoff is June 2025.
	* I made the name "CinnabarLM" by combining "Cinnabar" (the new block from the Chaos Cubed drop in Minecraft) and "LM" (Language Model).
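
For a sense of scale, the numbers above can be combined into a quick token budget. This is a back-of-the-envelope sketch that assumes the batch size counts sequences per optimizer step, that every sequence fills the full 256-token context, and that no gradient accumulation was used. None of these is confirmed by the card.

```python
# Values from the model card.
batch_size = 64               # sequences per step (ASSUMED: no gradient accumulation)
block_size = 256              # context window in tokens
max_iters = 10_000
dataset_tokens = 80_000_000   # FineWeb tokens used for training

tokens_per_iter = batch_size * block_size
tokens_seen = tokens_per_iter * max_iters
passes = tokens_seen / dataset_tokens

print(f"tokens per iteration: {tokens_per_iter:,}")
print(f"tokens processed:     {tokens_seen:,}")
print(f"passes over the data: {passes:.2f}")
```

If those assumptions hold, the run processes about 164 million tokens, or roughly two passes over the 80 million training tokens.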