Model Card — Handwritten Digit Classifier (CNN)

A Convolutional Neural Network (CNN) trained on the MNIST dataset to classify handwritten digits (0–9) with high accuracy. Designed for real-time inference in a web-based drawing interface.


Model Details

Model Description

This model is a CNN trained from scratch on the MNIST benchmark dataset. It accepts 28×28 grayscale images of handwritten digits and outputs a probability distribution over 10 classes (digits 0–9). It is the backbone of the Digit Classifier web app.

  • Developed by: Abdul Rafay
  • Model type: Convolutional Neural Network (CNN)
  • Language(s): N/A (Computer Vision — image input only)
  • License: MIT
  • Framework: PyTorch 2.0+
  • Finetuned from: Trained from scratch (no pretrained base)

Model Sources


  • Repository: github.com/abdurafay19/Digit-Classifier
  • Model page: huggingface.co/abdurafay19/Digit-Classifier

Uses

Direct Use

This model can be used directly to classify 28×28 grayscale images of handwritten digits — no fine-tuning required. It is best suited for:

  • Educational demos of deep learning and CNNs
  • Handwritten digit recognition in controlled environments
  • Integration into apps via the provided web UI or API

Downstream Use

The model can be fine-tuned or adapted for:

  • Multi-digit number recognition (e.g., street numbers, forms)
  • Similar single-character classification tasks
  • Transfer learning baseline for other image classification problems

Out-of-Scope Use

This model is not suitable for:

  • Recognizing letters, symbols, or non-digit characters
  • Noisy, real-world document scans without preprocessing
  • Multi-digit or multi-character sequences in a single image
  • Safety-critical systems (e.g., medical, legal document processing)

Bias, Risks, and Limitations

  • Dataset bias: MNIST digits are clean, centered, and size-normalized. The model may underperform on digits written in non-Western styles, extreme stroke widths, or unusual orientations.
  • Domain shift: Performance degrades on images that differ significantly from the MNIST distribution (e.g., photos of digits on paper, different fonts).
  • No uncertainty calibration: The model outputs softmax probabilities, which may appear confident even on out-of-distribution inputs.

Recommendations

  • Preprocess input images to 28×28 grayscale and center/normalize digits before inference.
  • Do not rely on model confidence scores alone — add a rejection threshold for production use.
  • Evaluate on your specific distribution before deploying in any real-world scenario.
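The rejection-threshold recommendation above can be sketched as follows. This is a minimal illustration, not part of the released code; the function name and the 0.9 threshold are assumptions to be tuned on your own data.

```python
import torch
import torch.nn.functional as F

def predict_with_rejection(model, tensor, threshold=0.9):
    """Return (digit, confidence), or (None, confidence) if the model is
    not confident enough. `threshold` is illustrative -- tune it on data
    from your own distribution.
    """
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(tensor), dim=1)  # [1, 10] class probabilities
        confidence, digit = probs.max(dim=1)
    if confidence.item() < threshold:
        return None, confidence.item()           # reject: too uncertain
    return digit.item(), confidence.item()
```

Note that softmax confidence is not calibrated (see the limitations above), so a high threshold reduces but does not eliminate confident mistakes on out-of-distribution inputs.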

How to Get Started with the Model

import torch
from torchvision import transforms
from PIL import Image
from model import Model  # your model definition

# Load model
model = Model()
model.load_state_dict(torch.load("model.pt", map_location="cpu"))  # works without a GPU
model.eval()

# Preprocess image
transform = transforms.Compose([
    transforms.Grayscale(),
    transforms.Resize((28, 28)),
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

img = Image.open("digit.png")
tensor = transform(img).unsqueeze(0)  # shape: [1, 1, 28, 28]

# Predict
with torch.no_grad():
    output = model(tensor)
    prediction = output.argmax(dim=1).item()

print(f"Predicted digit: {prediction}")

Training Details

Training Data

  • Dataset: MNIST — 70,000 grayscale images (60,000 train / 10,000 test)
  • Input size: 28×28 pixels, single channel
  • Classes: 10 (digits 0–9)

Training Procedure

Preprocessing

  • Images converted to tensors and normalized using MNIST dataset mean (0.1307) and std (0.3081)
  • Training augmentation: random rotation (±10°), random affine with translation (±10%), scale (0.9–1.1×), and shear (±5°)
  • Test images: normalization only — no augmentation

Training Hyperparameters

Parameter        Value
Optimizer        AdamW
Learning Rate    3e-3 (max, OneCycleLR)
Weight Decay     1e-4
Batch Size       64
Epochs           50
Loss Function    CrossEntropyLoss
Label Smoothing  0.1
LR Scheduler     OneCycleLR (10% warmup, cosine anneal)
Dropout (conv)   0.25 (Dropout2d)
Dropout (FC)     0.25
Random Seed      23
Training Regime  fp32
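Under the hyperparameters above, the training setup would look roughly like this. The model here is a placeholder, and `steps_per_epoch` is derived as 60,000 train images / batch size 64 ≈ 938; the original training script is not reproduced in this card.

```python
import torch
import torch.nn as nn

torch.manual_seed(23)                          # random seed from the table

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))  # placeholder model

epochs, steps_per_epoch = 50, 938              # 60,000 images / batch size 64

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=3e-3,
    epochs=epochs,
    steps_per_epoch=steps_per_epoch,
    pct_start=0.1,                             # 10% warmup
    anneal_strategy="cos",                     # cosine anneal
)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
```

In a training loop, `scheduler.step()` is called once per batch (after `optimizer.step()`), since OneCycleLR is a per-step schedule.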

Speeds, Sizes, Times

  • Training time: ~10 minutes on a single GPU (NVIDIA T4, Google Colab)
  • Model parameters: 160,842
  • Inference speed: <50ms per image (CPU)

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluated on the standard MNIST test split — 10,000 images not seen during training.

Factors

Evaluation was performed across all 10 digit classes. No disaggregation by subpopulation was conducted (MNIST does not include demographic metadata).

Metrics

  • Accuracy — primary metric; proportion of correctly classified digits
  • Confusion Matrix — to identify per-class error patterns
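Both metrics can be computed directly from predicted and true labels. A minimal, dependency-free sketch (the helper names are illustrative, not from the released code):

```python
import torch

def confusion_matrix(preds, labels, num_classes=10):
    """Rows = true digit, columns = predicted digit."""
    cm = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for t, p in zip(labels.tolist(), preds.tolist()):
        cm[t, p] += 1
    return cm

def accuracy(preds, labels):
    """Proportion of correctly classified samples."""
    return (preds == labels).float().mean().item()
```

Per-class accuracy, as reported in the Results tables, is then `cm.diag() / cm.sum(dim=1)`.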

Results

Metric         Value
Test Accuracy  99.43%

Per-Class Accuracy

Digit  Correct  Errors  Accuracy
0      980      0       100.0%
1      1132     3       99.7%
2      1025     7       99.3%
3      1008     2       99.8%
4      976      6       99.4%
5      885      7       99.2%
6      949      9       99.1%
7      1020     8       99.2%
8      968      6       99.4%
9      1000     9       99.1%

Summary

The model achieves 99.43% accuracy on the MNIST test set (57 total errors out of 10,000). Digit 0 achieves perfect classification. The most challenging classes are 6 and 9 (9 errors each), consistent with their visual similarity.


Model Examination

The model's convolutional filters learn edge detectors and stroke patterns in early layers, which compose into digit-specific features in deeper layers. Standard CNN interpretability techniques (e.g., Grad-CAM) can be applied to visualize which regions most influence predictions.
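A minimal Grad-CAM sketch along the lines mentioned above, using forward/backward hooks to recover the activations and gradients of a chosen convolutional layer. This is a generic illustration under the standard Grad-CAM formulation, not tooling shipped with this model; the function and layer names are assumptions.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, x, target_layer, class_idx=None):
    """Minimal Grad-CAM: heatmap of where `target_layer` drives the prediction."""
    acts, grads = {}, {}

    def fwd_hook(module, inp, out):
        acts["a"] = out.detach()                 # activations at the target layer

    def bwd_hook(module, grad_in, grad_out):
        grads["g"] = grad_out[0].detach()        # gradient of score w.r.t. activations

    h1 = target_layer.register_forward_hook(fwd_hook)
    h2 = target_layer.register_full_backward_hook(bwd_hook)
    try:
        logits = model(x)
        if class_idx is None:
            class_idx = logits.argmax(dim=1).item()
        model.zero_grad()
        logits[0, class_idx].backward()          # backprop the chosen class score
    finally:
        h1.remove()
        h2.remove()

    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # global-avg-pool the grads
    cam = F.relu((weights * acts["a"]).sum(dim=1))       # weighted sum of feature maps
    cam = cam / (cam.max() + 1e-8)                       # normalize to [0, 1]
    return cam  # shape: [1, H, W] at the target layer's spatial resolution
```

The resulting map is at the target layer's resolution; upsample it to 28×28 (e.g. with `F.interpolate`) to overlay it on the input digit.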


Environmental Impact

Carbon emissions estimated using the ML Impact Calculator.

Factor          Value
Hardware Type   NVIDIA T4 GPU
Hours Used      ~0.2 hrs (10 min)
Cloud Provider  Google Colab
Compute Region  Singapore
Carbon Emitted  ~0.01 kg CO₂eq (est.)

Technical Specifications

Model Architecture

The model uses 4 convolutional blocks followed by a compact fully connected head.

Convolutional Blocks

Block    Layer        Output Shape   Details
Block 1  Conv2d       (32, 28, 28)   32 filters, 3×3, padding=1
         BatchNorm2d  (32, 28, 28)   —
         ReLU         (32, 28, 28)   —
         MaxPool2d    (32, 14, 14)   2×2
         Dropout2d    (32, 14, 14)   p=0.25
Block 2  Conv2d       (64, 14, 14)   64 filters, 3×3, padding=1
         BatchNorm2d  (64, 14, 14)   —
         ReLU         (64, 14, 14)   —
         MaxPool2d    (64, 7, 7)     2×2
         Dropout2d    (64, 7, 7)     p=0.25
Block 3  Conv2d       (128, 7, 7)    128 filters, 3×3, padding=1
         BatchNorm2d  (128, 7, 7)    —
         ReLU         (128, 7, 7)    —
         MaxPool2d    (128, 3, 3)    2×2
         Dropout2d    (128, 3, 3)    p=0.25
Block 4  Conv2d       (256, 3, 3)    256 filters, 1×1 kernel (no padding)
         BatchNorm2d  (256, 3, 3)    —
         ReLU         (256, 3, 3)    —
         MaxPool2d    (256, 1, 1)    2×2
         Dropout2d    (256, 1, 1)    p=0.25

Fully Connected Layers

Layer    Output  Details
Flatten  256     256 × 1 × 1 = 256
Linear   128     + ReLU + Dropout(0.25)
Linear   10      Raw logits

Total Parameters: 160,842

Shape Flow

Input:   (B,   1, 28, 28)
Block 1: (B,  32, 14, 14)
Block 2: (B,  64,  7,  7)
Block 3: (B, 128,  3,  3)
Block 4: (B, 256,  1,  1)
Flatten: (B, 256)
FC1:     (B, 128)
Output:  (B,  10)
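The tables and shape flow above correspond to a module along these lines. This is a reconstruction from the card, not the released source; layer grouping and names are illustrative, but it reproduces the stated parameter count of 160,842.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, kernel_size=3, padding=1):
    """Conv -> BatchNorm -> ReLU -> MaxPool(2x2) -> Dropout2d, as in the table."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Dropout2d(0.25),
    )

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 32),                               # (B, 32, 14, 14)
            conv_block(32, 64),                              # (B, 64, 7, 7)
            conv_block(64, 128),                             # (B, 128, 3, 3)
            conv_block(128, 256, kernel_size=1, padding=0),  # (B, 256, 1, 1)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),          # (B, 256)
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.Linear(128, 10),    # raw logits
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```

Sanity check: `sum(p.numel() for p in Model().parameters())` yields 160,842, matching the Total Parameters figure above.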

Compute Infrastructure

  • Hardware: NVIDIA T4 GPU (Google Colab)
  • Software: Python 3.10+, PyTorch 2.0, torchvision

Citation

If you use this model in your work, please cite:

BibTeX:

@misc{digit-classifier-2026,
  author    = {Abdul Rafay},
  title     = {Handwritten Digit Classifier (CNN on MNIST)},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/abdurafay19/Digit-Classifier}
}

APA:

Abdul Rafay. (2026). Handwritten Digit Classifier (CNN on MNIST). Hugging Face. https://huggingface.co/abdurafay19/Digit-Classifier


Glossary

Term             Definition
CNN              Convolutional Neural Network — a deep learning architecture suited for image data
MNIST            A benchmark dataset of 70,000 handwritten digit images
Softmax          Activation function that converts raw outputs to probabilities summing to 1
Dropout          Regularization technique that randomly disables neurons during training
BatchNorm        Batch Normalization — normalizes layer activations to stabilize and speed up training
OneCycleLR       Learning rate schedule with warmup and cosine decay for faster convergence
Label Smoothing  Softens hard targets to reduce overconfidence and improve generalization
Grad-CAM         Gradient-weighted Class Activation Mapping — a model interpretability technique

Model Card Authors

Abdul Rafay — abdulrafay17wolf@gmail.com

Model Card Contact

For questions or issues, open a GitHub issue at github.com/abdurafay19/Digit-Classifier or reach out via Hugging Face.
