| --- |
| license: mit |
| datasets: |
| - ylecun/mnist |
| tags: |
| - harley-ml |
| - image |
| - digit-to-image |
| - mnist |
| - small |
| - text-to-image |
| --- |
| |
| # **MNIST-IMG-390k** |
|
|
| ## Summary |
|
|
| ``` |
| Task: Number-To-Image |
| Dataset: ylecun/mnist |
| Total training time: ~10 minutes |
| Inputs: Number (0-9) |
| Outputs: 32x32 image |
| Params: ~391k |
| Framework: PyTorch, diffusers |
| Author: Paul Courneya (Harley-ml) |
| ``` |
|
|
| ## **Description** |
| MNIST-IMG-390k is a tiny diffusion model (**~391k parameters**) trained on MNIST to **generate a 32x32 image** of a handwritten digit from an **input number (0-9)**. |
|
|
| ## Architecture |
|
|
| | Parameter | Value | |
| | -------------------- | -------------- | |
| | `image_size` | `32` | |
| | `in_channels` | `1` | |
| | `out_channels` | `1` | |
| | `num_classes` | `10` | |
| | `block_out_channels` | `[12, 16, 20]` | |
| | `layers_per_block` | `8` | |
| | `norm_num_groups` | `4` | |
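
The values in the table can be sanity-checked with a few lines of Python. This is a minimal sketch, not code from the repo: the dictionary keys simply mirror the table, and the helper names `groups_divide_channels` and `feature_map_sizes` are hypothetical. It also assumes that the resolution halves between consecutive blocks, as in typical U-Net designs, which this card does not confirm.

```python
# Values copied from the architecture table above.
config = {
    "image_size": 32,
    "in_channels": 1,
    "out_channels": 1,
    "num_classes": 10,
    "block_out_channels": [12, 16, 20],
    "layers_per_block": 8,
    "norm_num_groups": 4,
}


def groups_divide_channels(cfg):
    """Group normalization needs every channel count to be a multiple of the group count."""
    return all(c % cfg["norm_num_groups"] == 0 for c in cfg["block_out_channels"])


def feature_map_sizes(cfg):
    """ASSUMPTION: resolution halves between consecutive blocks (not confirmed by the repo)."""
    return [cfg["image_size"] // (2 ** i) for i in range(len(cfg["block_out_channels"]))]


print(groups_divide_channels(config))  # True
print(feature_map_sizes(config))       # [32, 16, 8]
```

All three channel widths (12, 16, 20) are multiples of 4, so `norm_num_groups = 4` is a valid choice for every block.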
|
|
| ## **Training** |
|
|
| ### **Hardware** |
|
|
| MNIST-IMG-390k was trained on Google Colaboratory (NVIDIA Tesla T4) for ~10 minutes: 10 epochs with a batch size of 64. |
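
For a rough sense of scale, the batch size and epoch count translate into a number of optimizer steps. The arithmetic below is a sketch under two assumptions that this card does not state: that the full 60,000-image MNIST training split was used, and that the last partial batch of each epoch was kept.

```python
import math

TRAIN_IMAGES = 60_000     # size of the standard MNIST training split (ASSUMED to be used in full)
BATCH_SIZE = 64           # from the model card
EPOCHS = 10               # from the model card
TRAIN_SECONDS = 10 * 60   # "~10 minutes", so treat the throughput as approximate

# ASSUMPTION: the final partial batch of each epoch is kept, hence the ceiling.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
steps_per_second = total_steps / TRAIN_SECONDS

print(steps_per_epoch)             # 938
print(total_steps)                 # 9380
print(round(steps_per_second, 1))  # 15.6
```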
|
|
| ### **Dataset** |
|
|
| [ylecun/mnist](https://huggingface.co/datasets/ylecun/mnist) |
|
|
| ### **Training Results** |
|
|
| The training loss ended at ~**0.39**. |
|
|
| Note: I can't provide the raw training logs because I lost them sometime after training. Sorry! |
|
|
| ## **Generation Examples** |
|
|
| At **1000** decoding steps: |
|
|
|  |
|
|
| At **200** decoding steps: |
|
|
|  |
|
|
| ## Inference |
|
|
| Use the [inference.py](https://huggingface.co/Harley-ml/MNIST-IMG-390k/blob/main/inference.py) script in the repo. |
|
|
| ## Related Models |
|
|
| 1. [SupraMNST-IMG-200k](https://huggingface.co/SupraLabs/SupraMNST-IMG-200k) |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{mnist-img-390k, |
| title = {MNIST-IMG-390k: a Tiny Diffusion Model for Generating Handwritten Digits}, |
| author = {Paul Courneya; Harley-ml}, |
| year = {2026}, |
| url = {https://huggingface.co/Harley-ml/MNIST-IMG-390k} |
| } |
| ``` |