---
title: MegaStyle Image Style Comparison
emoji: 🎨
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.50.0
python_version: '3.10'
app_file: app.py
pinned: false
short_description: Compare image style similarity with MegaStyle-Encoder
license: mit
---
**Deploying this Space:** select ZeroGPU hardware in the Space's Settings → Hardware panel after creating it. ZeroGPU is not configured via frontmatter.
# MegaStyle Image Style Comparison
Upload a test image and 1–8 reference images, click **Compare styles**, and get a style-similarity score (0–100) plus a human-readable verdict. Powered by MegaStyle-Encoder, a SigLIP-based style encoder trained on the 1.4M-image MegaStyle dataset with style-supervised contrastive learning; see the paper MegaStyle (arXiv:2604.08364).
## How it works
- Each image is embedded with MegaStyle-Encoder into a unit-length style vector.
- Cosine similarity between the test vector and each reference vector gives a per-reference score.
- The headline score is the mean of those per-reference scores, shown as a percentage for readability. A per-reference table is shown below for transparency.
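The scoring logic above can be sketched with NumPy, assuming the style embeddings have already been computed by the encoder and L2-normalized (the `style_score` helper and the toy 3-D vectors are illustrative, not the app's actual code):

```python
import numpy as np

def style_score(test_vec, ref_vecs):
    """Mean cosine similarity between a test embedding and reference embeddings.

    Assumes all inputs are unit-length style vectors, so cosine similarity
    reduces to a dot product.
    """
    test = np.asarray(test_vec, dtype=np.float64)
    refs = np.asarray(ref_vecs, dtype=np.float64)
    per_ref = refs @ test          # one cosine score per reference image
    return float(per_ref.mean()), per_ref

# Toy example with 3-D "embeddings" standing in for real encoder outputs
test = np.array([1.0, 0.0, 0.0])
refs = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])
score, per_ref = style_score(test, refs)
# score == 0.5, per_ref == [1.0, 0.0]
```

The headline percentage shown in the UI is simply `score * 100`, and `per_ref` populates the per-reference table.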
## Verdict labels
The verdict is a heuristic bucketing of the cosine-similarity score:
| Score range | Label |
|---|---|
| ≥ 0.75 | 🟢 Strong style match |
| 0.65 – 0.75 | 🟢 Good style match |
| 0.55 – 0.65 | 🟡 Moderate style match |
| 0.45 – 0.55 | 🟠 Weak style match |
| < 0.45 | 🔴 Minimal style match |
These thresholds are not calibrated against ground-truth style labels; they are rule-of-thumb bands tuned for the typical cosine-similarity range of SigLIP-family encoders (where even unrelated images can sit around 0.4–0.6). Treat the raw number as the source of truth.
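The bucketing can be written as a simple threshold cascade; this `verdict` function is a hypothetical sketch of the table, not the Space's actual implementation:

```python
def verdict(score: float) -> str:
    """Map a cosine-similarity score to a heuristic verdict label."""
    if score >= 0.75:
        return "🟢 Strong style match"
    if score >= 0.65:
        return "🟢 Good style match"
    if score >= 0.55:
        return "🟡 Moderate style match"
    if score >= 0.45:
        return "🟠 Weak style match"
    return "🔴 Minimal style match"
```

Because the checks run top-down, each boundary value (e.g. exactly 0.75) falls into the higher bucket.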
## Credits
- Paper: Gao et al., MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping, arXiv:2604.08364, 2026.
- Upstream code: Tencent/MegaStyle
- Model weights: Gaojunyao/MegaStyle (MIT)
- Backbone: google/siglip-so400m-patch14-384 (Apache-2.0)