# mini-style-transfer

A small, fast artistic style transfer model built with PyTorch as a learning project.
Applies 4 artistic styles to any photo in under 1 second on CPU.

Based on Johnson et al. (2016), *Perceptual Losses for Real-Time Style Transfer*.


## What it does

| Input photo | Style painting | Output |
|-------------|----------------|--------|
| Any photo (any size) | Starry Night / Mosaic / Candy / Sketch | Stylised version |

## Styles available

| File | Style |
|------|-------|
| `starry_night.pth` | Van Gogh, *Starry Night* |
| `mosaic.pth` | Classic mosaic tile pattern |
| `candy.pth` | Bright candy colours |
| `sketch.pth` | Pencil sketch look |

## Quick start

```python
import torch
from torchvision import transforms
from PIL import Image
from model import StyleNet

# 1. Load model
model = StyleNet()
model.load_state_dict(torch.load("starry_night.pth", map_location="cpu"))
model.eval()  # inference mode: freeze norm statistics

# 2. Prepare your image
img = Image.open("my_photo.jpg").convert("RGB")
to_tensor = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
tensor = to_tensor(img).unsqueeze(0)  # add batch dimension: (1, 3, H, W)

# 3. Run inference
with torch.no_grad():
    output = model(tensor).squeeze(0).clamp(0, 1)

# 4. Save result
result = transforms.ToPILImage()(output)
result.save("styled_output.jpg")
print("Done! Open styled_output.jpg")
```

Or use the included `run.py` script:

```shell
python run.py --model starry_night.pth --input my_photo.jpg --output result.jpg
```

## Model details

| Property | Value |
|----------|-------|
| Architecture | Feed-forward CNN (Encoder → 5× ResBlock → Decoder) |
| Parameters | ~450K |
| Model size | ~1.7 MB per style |
| Input | Any RGB image, any resolution |
| Output | Same size as input, styled |
| Framework | PyTorch 2.x |
| Normalisation | ImageNet mean/std |
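
The architecture row above can be sketched in a few lines. This is a hypothetical reconstruction matching the stated layout (Encoder → 5× ResBlock → Decoder, roughly 450K parameters), not the actual `model.py`; the channel counts and kernel sizes are assumptions:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block: two 3x3 convs with instance norm and a skip connection."""
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch),
        )
    def forward(self, x):
        return x + self.block(x)

class StyleNetSketch(nn.Module):
    """Hypothetical feed-forward style network: encoder, 5 res blocks, decoder."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 9, padding=4), nn.InstanceNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.InstanceNorm2d(64), nn.ReLU(),
        )
        self.res = nn.Sequential(*[ResBlock(64) for _ in range(5)])
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(64, 32, 3, padding=1), nn.InstanceNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 3, 9, padding=4), nn.Sigmoid(),  # output pixels in [0, 1]
        )
    def forward(self, x):
        return self.decoder(self.res(self.encoder(x)))

# Fully convolutional, so (even-sized) inputs of any resolution pass through,
# which is consistent with the "any resolution in, same size out" rows above.
out = StyleNetSketch()(torch.rand(1, 3, 64, 64))
```

A network like this has about 420K parameters, in the same ballpark as the ~450K stated above.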

## Training details

| Property | Value |
|----------|-------|
| Content dataset | MS-COCO train2017 (subset) |
| Style images | 4 artwork images |
| Epochs | 2 per style |
| Batch size | 4 |
| Image size (training) | 256 × 256 |
| Optimizer | Adam, lr=1e-3 |
| Loss | Perceptual (VGG16): content + style |
| Content weight | 1.0 |
| Style weight | 1e5 |
| Training time | ~45 min per style (GPU) |
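
The loss row above combines a content term (feature-space distance) and a style term (Gram-matrix distance) over VGG16 feature maps. A minimal sketch of how those terms might be computed, using the weights from the table; the function names, layer choice, and exact normalisation are illustrative, and the real `train.py` may differ:

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    """Channel-by-channel correlations of a feature map: (B, C, H, W) -> (B, C, C).
    The Gram matrix discards spatial layout and keeps texture statistics."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def perceptual_loss(out_feats, content_feats, style_grams,
                    content_weight=1.0, style_weight=1e5):
    """out_feats / content_feats: lists of VGG16 feature maps for the stylised
    output and the content image; style_grams: precomputed Gram matrices of the
    style image at the same layers."""
    # Content loss: feature-space MSE at one mid-level layer (index is illustrative).
    content = F.mse_loss(out_feats[1], content_feats[1])
    # Style loss: MSE between Gram matrices, summed over all chosen layers.
    style = sum(
        F.mse_loss(gram_matrix(f), g)
        for f, g in zip(out_feats, style_grams)
    )
    return content_weight * content + style_weight * style
```

Because the Gram matrix averages over all spatial positions, matching it reproduces the style image's textures without copying its layout, which is why the style weight can be so much larger than the content weight.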

## Repository structure

```
mini-style-transfer/
├── model.py            ← StyleNet architecture
├── train.py            ← Training script
├── run.py              ← Inference script
├── starry_night.pth    ← Trained weights (Starry Night style)
├── mosaic.pth          ← Trained weights (mosaic style)
├── candy.pth           ← Trained weights (candy style)
├── sketch.pth          ← Trained weights (sketch style)
└── README.md           ← This file
```

## Limitations

- Each style is a separate model file; there is no single multi-style model yet
- Works best on natural photos (landscapes, portraits, cities)
- Cartoons, diagrams, and text-heavy images may give unexpected results
- Training images were 256×256; very high-resolution outputs may look slightly blurry
- Not suitable for commercial use without further evaluation
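
A common workaround for the high-resolution blurriness is to stylise at a resolution closer to the 256×256 training size and resize the result back up. This helper is not part of the repo, just a sketch; `model_fn` stands for any callable wrapping the quick-start steps (PIL image in, PIL image out):

```python
from PIL import Image

def stylise_capped(model_fn, img, max_side=1024):
    """Run style transfer at a capped resolution, then resize back to the
    original size. model_fn: hypothetical PIL.Image -> PIL.Image callable."""
    w, h = img.size
    scale = max_side / max(w, h)
    if scale < 1.0:
        small = img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)
        return model_fn(small).resize((w, h), Image.LANCZOS)
    return model_fn(img)  # already small enough: stylise directly
```

The upscale softens fine detail slightly, but the style patterns come out at the scale the network was trained on, which usually looks better than feeding a 4000-pixel image straight in.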

## What I learned building this

- How convolutional encoders and decoders work together
- What Instance Normalisation does compared with Batch Normalisation
- How Gram matrices capture texture and style
- What perceptual loss is, and why pixel-level loss looks bad for style transfer
- How to use a pretrained VGG network as a feature extractor without training it
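
On the second point, the difference is easy to see numerically: instance norm normalises each (image, channel) slice on its own, while batch norm only normalises each channel across the whole batch, so individual images keep their relative brightness differences. A quick check:

```python
import torch

torch.manual_seed(0)
x = torch.rand(4, 3, 8, 8)  # a batch of 4 small "images"

inorm = torch.nn.InstanceNorm2d(3)
bnorm = torch.nn.BatchNorm2d(3, affine=False)  # training mode: uses batch stats

yi = inorm(x)
yb = bnorm(x)

# Instance norm: every (image, channel) slice is individually zero-mean.
per_image_means = yi.mean(dim=(2, 3))        # shape (4, 3), all ~0
# Batch norm: only the per-channel mean over the whole batch is zero;
# per-image means stay nonzero.
per_channel_means = yb.mean(dim=(0, 2, 3))   # shape (3,), all ~0
```

This per-image independence is why instance norm is the standard choice in feed-forward style transfer: each photo's own contrast statistics are removed before the style is applied.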

## References

- Johnson, J., Alahi, A., Fei-Fei, L. (2016). *Perceptual Losses for Real-Time Style Transfer and Super-Resolution*. ECCV 2016.
Built as a learning project. Feedback and suggestions welcome!
