# DINOv2-small + MLPHead Aesthetic Scorer
A lightweight aesthetic regression model built on a frozen facebook/dinov2-small backbone
with a trainable MLP head that predicts an aesthetic score from image embeddings.
## Architecture
| Component | Details |
|---|---|
| Backbone | facebook/dinov2-small (frozen, not included in this checkpoint) |
| Input | CLS token — shape (B, 384) |
| Head | Linear(384->256) -> GELU -> Dropout(0.3) -> Linear(256->1) |
| Output | Scalar aesthetic score per image |
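As a quick sanity check on the sizes in the table, the head is small: the two linear layers account for every trainable parameter (GELU and Dropout add none). A minimal sketch, using the layer sizes above:

```python
import torch.nn as nn

# Head layers as listed in the table: Linear(384->256) -> GELU -> Dropout(0.3) -> Linear(256->1)
head = nn.Sequential(
    nn.Linear(384, 256),
    nn.GELU(),
    nn.Dropout(0.3),
    nn.Linear(256, 1),
)

# Trainable parameters: (384*256 + 256) + (256*1 + 1) = 98,817
n_params = sum(p.numel() for p in head.parameters())
print(n_params)  # 98817
```

At under 100k parameters, the head trains quickly even on CPU once embeddings are precomputed.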
## Checkpoint Contents
This `.pt` file contains only the `MLPHead` state dict (4 tensors: two weight matrices and two bias vectors).
The DINOv2 backbone is not included in the checkpoint; it is loaded separately from `facebook/dinov2-small`.
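The 4 tensors can be seen by inspecting the head's state dict. The sketch below constructs the head locally (using the class from the Usage section) rather than downloading the checkpoint; the key names follow PyTorch's convention for an `nn.Sequential` stored under the attribute `net`, and the downloaded checkpoint is expected to use the same keys:

```python
import torch
import torch.nn as nn

class MLPHead(nn.Module):
    def __init__(self, embed_dim=384, hidden_dim=256, dropout_p=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout_p),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

# Exactly 4 tensors: GELU and Dropout contribute no parameters.
for name, tensor in MLPHead().state_dict().items():
    print(name, tuple(tensor.shape))
# net.0.weight (256, 384)
# net.0.bias (256,)
# net.3.weight (1, 256)
# net.3.bias (1,)
```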
## Usage
```python
import torch
import torch.nn as nn
from transformers import AutoImageProcessor, Dinov2Model
from huggingface_hub import hf_hub_download
from PIL import Image


class MLPHead(nn.Module):
    def __init__(self, embed_dim=384, hidden_dim=256, dropout_p=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout_p),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)


# Load the frozen backbone
processor = AutoImageProcessor.from_pretrained("facebook/dinov2-small")
backbone = Dinov2Model.from_pretrained("facebook/dinov2-small").eval()

# Load the trained head
ckpt_path = hf_hub_download(
    repo_id="grantmwilkinson/dinov2-small-mlphead-aesthetic",
    filename="dinov2-small_MLPHead_best.pt",
)
head = MLPHead()
head.load_state_dict(torch.load(ckpt_path, map_location="cpu", weights_only=True))
head.eval()

# Inference
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    cls_token = backbone(**inputs).last_hidden_state[:, 0]  # (1, 384)
    score = head(cls_token)                                 # (1,)

print(f"Aesthetic score: {score.item():.4f}")
```
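Because the backbone is frozen, CLS embeddings can be precomputed once and the head applied to whole batches on its own. A minimal sketch, using random tensors in place of real embeddings to illustrate the shapes (for real scores, load the checkpoint into the head as shown above):

```python
import torch
import torch.nn as nn

# MLPHead as defined in the Usage snippet.
class MLPHead(nn.Module):
    def __init__(self, embed_dim=384, hidden_dim=256, dropout_p=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout_p),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

head = MLPHead().eval()  # untrained here; load the checkpoint state dict for real scores

# A batch of 8 precomputed CLS embeddings (random, for shape illustration only)
embeddings = torch.randn(8, 384)
with torch.no_grad():
    scores = head(embeddings)  # one score per image

print(scores.shape)  # torch.Size([8])
```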