Paper: Perceptual Losses for Real-Time Style Transfer and Super-Resolution (arXiv:1603.08155)
A small, fast artistic style transfer model built with PyTorch as a learning project.
Applies any of 4 artistic styles to a photo in under 1 second on CPU.
Based on Johnson et al. (2016), "Perceptual Losses for Real-Time Style Transfer and Super-Resolution".
| Input photo | + Style painting | → Output |
|---|---|---|
| Any photo (any size) | Starry Night / Mosaic / Candy / Sketch | Stylised version |

| File | Style |
|---|---|
| starry_night.pth | Van Gogh, Starry Night |
| mosaic.pth | Classic mosaic tile pattern |
| candy.pth | Bright candy colours |
| sketch.pth | Pencil sketch look |

import torch
from torchvision import transforms
from PIL import Image
from model import StyleNet
# 1. Load model
model = StyleNet()
model.load_state_dict(torch.load("starry_night.pth", map_location="cpu"))
model.eval()
# 2. Prepare your image
img = Image.open("my_photo.jpg").convert("RGB")
to_tensor = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
tensor = to_tensor(img).unsqueeze(0)
# 3. Run inference
with torch.no_grad():
    output = model(tensor).squeeze(0).clamp(0, 1)
# 4. Save result
result = transforms.ToPILImage()(output)
result.save("styled_output.jpg")
print("Done! Open styled_output.jpg")
Or use the included run.py script:
python run.py --model starry_night.pth --input my_photo.jpg --output result.jpg
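
To try all four styles on one photo, the command above can be generated once per weight file. The following is a minimal sketch that only builds the argument lists, using the `--model`, `--input` and `--output` flags shown above. The `build_commands` helper and the `<photo>_<style>.jpg` output naming are assumptions made for illustration and are not part of `run.py`.

```python
import shlex
import sys
from pathlib import Path

# Weight files listed in this README.
STYLE_WEIGHTS = ["starry_night.pth", "mosaic.pth", "candy.pth", "sketch.pth"]


def build_commands(photo, out_dir="."):
    """Return one run.py argument list per style for a single input photo."""
    photo = Path(photo)
    commands = []
    for weights in STYLE_WEIGHTS:
        style = Path(weights).stem  # e.g. "starry_night"
        # Hypothetical naming scheme: <photo stem>_<style>.jpg
        output = Path(out_dir) / f"{photo.stem}_{style}.jpg"
        commands.append([
            sys.executable, "run.py",
            "--model", weights,
            "--input", str(photo),
            "--output", str(output),
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_commands("my_photo.jpg"):
        # Print the command; pass cmd to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(cmd))
```

Building argument lists rather than shell strings avoids quoting problems with file names that contain spaces.
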
| Property | Value |
|---|---|
| Architecture | Feed-forward CNN (Encoder → 5× ResBlock → Decoder) |
| Parameters | ~450K |
| Model size | ~1.7 MB per style |
| Input | Any RGB image, any resolution |
| Output | Same size as input, styled |
| Framework | PyTorch 2.x |
| Normalisation | ImageNet mean/std |
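
The Encoder → 5× ResBlock → Decoder layout from the table can be sketched as below. This is not the contents of `model.py`: the class name `SketchStyleNet`, the channel widths (16/32/64), the kernel sizes, the reflect padding, the instance normalisation and the upsample-then-convolve decoder are all assumptions chosen to illustrate the shape of such a network.

```python
import torch
from torch import nn


def conv_block(c_in, c_out, kernel, stride=1):
    """Conv -> InstanceNorm -> ReLU; stride=2 halves the spatial size."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel, stride=stride,
                  padding=kernel // 2, padding_mode="reflect"),
        nn.InstanceNorm2d(c_out, affine=True),
        nn.ReLU(),
    )


def up_block(c_in, c_out):
    """Nearest-neighbour upsample by 2, then a 3x3 conv block."""
    return nn.Sequential(nn.Upsample(scale_factor=2, mode="nearest"),
                         conv_block(c_in, c_out, 3))


class ResBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection; shape is unchanged."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, padding_mode="reflect"),
            nn.InstanceNorm2d(channels, affine=True),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, padding_mode="reflect"),
            nn.InstanceNorm2d(channels, affine=True),
        )

    def forward(self, x):
        return x + self.body(x)


class SketchStyleNet(nn.Module):
    """Hypothetical stand-in for StyleNet (assumed widths, not the real model)."""

    def __init__(self, base=16, n_res=5):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(3, base, 9),
            conv_block(base, base * 2, 3, stride=2),
            conv_block(base * 2, base * 4, 3, stride=2),
        )
        self.res = nn.Sequential(*[ResBlock(base * 4) for _ in range(n_res)])
        self.decoder = nn.Sequential(
            up_block(base * 4, base * 2),
            up_block(base * 2, base),
            nn.Conv2d(base, 3, 9, padding=4, padding_mode="reflect"),
        )

    def forward(self, x):
        return self.decoder(self.res(self.encoder(x)))


if __name__ == "__main__":
    net = SketchStyleNet().eval()
    n_params = sum(p.numel() for p in net.parameters())
    with torch.no_grad():
        out = net(torch.rand(1, 3, 64, 96))
    # float32 weights take 4 bytes each
    print(n_params, f"{n_params * 4 / 1e6:.2f} MB", tuple(out.shape))
```

With these assumed widths the sketch has roughly 425K parameters, or about 1.7 MB as float32, which is in the same range as the figures in the table but does not imply it matches the real architecture. Because the network is fully convolutional it accepts any resolution. In this sketch the output size equals the input size only when height and width are multiples of 4.
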
| Property | Value |
|---|---|
| Content dataset | MS-COCO train2017 (subset) |
| Style images | 4 artwork images |
| Epochs | 2 per style |
| Batch size | 4 |
| Image size (training) | 256 × 256 |
| Optimizer | Adam, lr=1e-3 |
| Loss | Perceptual (VGG16): content + style |
| Content weight | 1.0 |
| Style weight | 1e5 |
| Training time | ~45 min per style (GPU) |
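
The content + style loss in the table can be sketched as follows, using the content weight 1.0 and style weight 1e5 listed above. This is an illustration of the general perceptual-loss recipe, not the code in `train.py`. The function names, the Gram-matrix normalisation and the choice of which layer carries the content term (`content_layer`) are assumptions. Feature maps are passed in as plain lists so the sketch runs without downloading VGG16 weights.

```python
import torch
import torch.nn.functional as F


def gram_matrix(feat):
    """Channel-by-channel correlations of a (B, C, H, W) feature map."""
    b, c, h, w = feat.shape
    flat = feat.reshape(b, c, h * w)
    # Normalise by the number of elements per sample (an assumed convention).
    return flat @ flat.transpose(1, 2) / (c * h * w)


def perceptual_loss(out_feats, content_feats, style_grams,
                    content_layer=1, content_weight=1.0, style_weight=1e5):
    """Weighted sum of a content term and a style term.

    out_feats / content_feats: lists of feature maps (e.g. from VGG16) for the
    stylised output and the original photo.
    style_grams: Gram matrices of the style image, one (1, C, C) tensor per layer.
    content_layer: index of the layer used for the content term (assumed).
    """
    content = F.mse_loss(out_feats[content_layer], content_feats[content_layer])
    style = sum(
        F.mse_loss(gram_matrix(f), g.expand(f.shape[0], -1, -1))
        for f, g in zip(out_feats, style_grams)
    )
    return content_weight * content + style_weight * style


if __name__ == "__main__":
    # Random tensors stand in for VGG16 activations at two layers.
    photo = [torch.rand(4, 8, 16, 16), torch.rand(4, 16, 8, 8)]
    styled = [f + 0.1 * torch.randn_like(f) for f in photo]
    style = [gram_matrix(torch.rand(1, 8, 16, 16)),
             gram_matrix(torch.rand(1, 16, 8, 8))]
    print(perceptual_loss(styled, photo, style).item())
```

The Gram matrices of the style image are computed once before training and reused for every batch, since the style image never changes.
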
mini-style-transfer/
├── model.py ← StyleNet architecture
├── train.py ← Training script
├── run.py ← Inference script
├── starry_night.pth ← Trained weights (starry night style)
├── mosaic.pth ← Trained weights (mosaic style)
├── candy.pth ← Trained weights (candy style)
├── sketch.pth ← Trained weights (sketch style)
└── README.md ← This file
Built as a learning project. Feedback and suggestions welcome!