---
license: mit
datasets:
- ylecun/mnist
tags:
- harley-ml
- image
- digit-to-image
- mnist
- small
- text-to-image
---

# **MNiST-IMG-390k**

## Summary

```
Task: Number-To-Image
Dataset: ylecun/mnist
Total training time: ~10 minutes
Inputs: Number (0-9) 
Outputs: 32x32 image
Params: ~391k
Framework: PyTorch, diffusers
Author: Paul Courneya (Harley-ml)
```

## **Description**
MNiST-IMG-390k is a ~**390k-parameter model** trained to **generate an image** of a handwritten digit from an **input number (0-9)**.

## Architecture

| Parameter            | Value          |
| -------------------- | -------------- |
| `image_size`         | `32`           |
| `in_channels`        | `1`            |
| `out_channels`       | `1`            |
| `num_classes`        | `10`           |
| `block_out_channels` | `[12, 16, 20]` |
| `layers_per_block`   | `8`            |
| `norm_num_groups`    | `4`            |
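
One constraint implied by this configuration is that group normalization needs every channel count to be divisible by the number of groups. The sketch below checks that for the values in the table. The `config` dict and the `check_group_norm` helper are hypothetical names used for illustration and are not taken from the repo's code.

```python
# Values copied from the architecture table; the dict itself is illustrative.
config = {
    "image_size": 32,
    "in_channels": 1,
    "out_channels": 1,
    "num_classes": 10,
    "block_out_channels": [12, 16, 20],
    "layers_per_block": 8,
    "norm_num_groups": 4,
}


def check_group_norm(channels, num_groups):
    """Return channels-per-group for each block.

    Raises ValueError if any channel count cannot be split evenly
    into `num_groups` groups.
    """
    per_group = []
    for c in channels:
        if c % num_groups != 0:
            raise ValueError(f"{c} channels not divisible by {num_groups} groups")
        per_group.append(c // num_groups)
    return per_group


channels_per_group = check_group_norm(
    config["block_out_channels"], config["norm_num_groups"]
)
print(channels_per_group)  # [3, 4, 5]
```

With 4 groups, the three blocks normalize over 3, 4, and 5 channels per group respectively, so the table's values are mutually consistent.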

## **Training**

### **Hardware**

MNiST-IMG was trained on Google Colaboratory (NVIDIA Tesla T4) for 10 epochs with a batch size of 64, which took ~10 minutes.
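
For a rough sense of scale, the figures above can be turned into a step count. This is back-of-the-envelope arithmetic, assuming training used the full 60,000-image MNIST training split and kept the final partial batch of each epoch; neither detail is stated in this card.

```python
import math

TRAIN_IMAGES = 60_000     # assumption: full MNIST training split
BATCH_SIZE = 64           # from this card
EPOCHS = 10               # from this card
TRAIN_SECONDS = 10 * 60   # "~10 minutes" from this card

# Batches per epoch, keeping the last partial batch (assumption).
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
steps_per_second = total_steps / TRAIN_SECONDS

print(steps_per_epoch)             # 938
print(total_steps)                 # 9380
print(round(steps_per_second, 1))  # 15.6
```

Under those assumptions, the run amounts to roughly 9,400 optimizer steps at about 16 steps per second.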

### **Dataset**

[ylecun/mnist](https://huggingface.co/datasets/ylecun/mnist)

### **Training Results**

Loss ended at ~**0.39**.

Note: I can't provide the raw training logs because I lost them sometime after training. Sorry!

## **Generation Examples**

At **1000** decoding steps:

![1000 Decoding Step Digit Image Generation](images/digit_image_samples_1000s.png)

At **200** decoding steps:

![200 Decoding Step Generation Image](images/digit_image_samples_200s.png)
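
"Decoding steps" here means the number of denoising iterations the sampler runs. As a rough sketch, assuming the model was trained with 1000 diffusion timesteps and the sampler visits evenly spaced timesteps (both are assumptions; the scheduler used is not stated in this card), a 200-step run would visit every fifth timestep. The `spaced_timesteps` helper is a hypothetical name for illustration.

```python
def spaced_timesteps(num_train_timesteps, num_inference_steps):
    """Evenly spaced timesteps, ordered from noisiest to cleanest."""
    if not 0 < num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in 1..num_train_timesteps")
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]


full = spaced_timesteps(1000, 1000)
short = spaced_timesteps(1000, 200)

print(len(full), full[:3], full[-1])     # 1000 [999, 998, 997] 0
print(len(short), short[:3], short[-1])  # 200 [995, 990, 985] 0
```

Fewer steps means fewer network evaluations per image, which trades sample quality for speed.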

# Inference

Use the script in the repo: [inference.py](https://huggingface.co/Harley-ml/MNIST-IMG-390k/blob/main/inference.py)
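
If you only want the script without cloning the repo, one option is to fetch the raw file. The sketch below assumes the Hub's usual convention of serving raw files from a `/resolve/` path in place of `/blob/`; `raw_url` and `download_script` are hypothetical helper names, and the download itself needs network access.

```python
from urllib.request import urlretrieve

# Page URL from this card.
BLOB_URL = "https://huggingface.co/Harley-ml/MNIST-IMG-390k/blob/main/inference.py"


def raw_url(blob_url):
    """Turn a Hub 'blob' page URL into the 'resolve' URL for the raw file."""
    return blob_url.replace("/blob/", "/resolve/", 1)


def download_script(dest="inference.py"):
    """Download the script to `dest` and return the local path.

    Not called here because it requires network access.
    """
    path, _ = urlretrieve(raw_url(BLOB_URL), dest)
    return path


print(raw_url(BLOB_URL))
```

After calling `download_script()`, run the file with Python. Its command-line arguments are not documented in this card, so check the script itself for usage.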

### Related Models

1. [SupraMNST-IMG-200k](https://huggingface.co/SupraLabs/SupraMNST-IMG-200k)

# Citation

```bibtex
@misc{mnist-img-390k,
  title     = {MNIST-IMG-390k: a Tiny Diffusion Model for Generating Handwritten Digits},
  author    = {Paul Courneya; Harley-ml},
  year      = {2026},
  url       = {https://huggingface.co/Harley-ml/MNIST-IMG-390k}
}
```