---
title: MegaStyle Image Style Comparison
emoji: 🎨
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.50.0
python_version: "3.10"
app_file: app.py
pinned: false
short_description: Compare image style similarity with MegaStyle-Encoder
license: mit
---
> **Deploying this Space:** select **ZeroGPU** hardware in the Space's *Settings → Hardware*
> panel after creating it. ZeroGPU is not configured via frontmatter.
# MegaStyle Image Style Comparison
Upload a **test image** and **1–8 reference images**, hit **Compare styles**, and get a
style-similarity score (0–100) plus a human-readable verdict. Powered by
[MegaStyle-Encoder](https://huggingface.co/Gaojunyao/MegaStyle), a SigLIP-based style encoder
trained on the 1.4M-image [MegaStyle dataset](https://huggingface.co/datasets/tencent/MegaStyle-1.4M)
with style-supervised contrastive learning; see the paper
[MegaStyle (arXiv:2604.08364)](https://arxiv.org/abs/2604.08364).
## How it works
1. Each image is embedded with MegaStyle-Encoder into a unit-length style vector.
2. Cosine similarity between the test vector and each reference vector gives a per-reference score.
3. The headline score is the **mean** of those per-reference scores, shown as a percentage for
readability. A per-reference table is shown below for transparency.
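
Steps 2 and 3 can be sketched roughly as follows. This is an illustration only, not the code in `app.py`: it assumes the embeddings are already available as plain lists of floats, omits the encoder call entirely, and uses hypothetical helper names (`l2_normalize`, `cosine_similarity`, `headline_score`).

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity; for unit-length inputs this reduces to the dot product."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def headline_score(test_vec, reference_vecs):
    """Return (mean similarity, list of per-reference similarities)."""
    if not reference_vecs:
        raise ValueError("at least one reference embedding is required")
    per_ref = [cosine_similarity(test_vec, ref) for ref in reference_vecs]
    return sum(per_ref) / len(per_ref), per_ref


# Toy 3-dimensional vectors standing in for real encoder outputs (assumption).
test = [1.0, 0.0, 0.0]
refs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mean, per_ref = headline_score(test, refs)
print(f"{mean * 100:.1f}%", per_ref)
```

With these toy vectors the first reference is identical to the test vector and the second is orthogonal to it, so the mean lands halfway between the two per-reference scores.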
## Verdict labels
The verdict is a heuristic bucketing of the cosine-similarity score:
| Score range | Label |
|-------------|-------|
| `≥ 0.75` | 🟢 Strong style match |
| `0.65 – 0.75` | 🟢 Good style match |
| `0.55 – 0.65` | 🟡 Moderate style match |
| `0.45 – 0.55` | 🟠 Weak style match |
| `< 0.45` | 🔴 Minimal style match |

These thresholds are not calibrated against ground-truth style labels; they are rule-of-thumb
bands tuned for the typical cosine-similarity range of SigLIP-family encoders (where even
unrelated images can sit around 0.4–0.6). Treat the raw number as the source of truth.
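
One way to express the bucketing in the table above is a small lookup checked from the highest band down. This is a sketch under assumptions: the name `verdict` is hypothetical, the emoji are left out, and each lower bound is treated as inclusive, which the table only states explicitly for the top band.

```python
# Lower bounds from the verdict table, ordered from highest to lowest.
# Treating every lower bound as inclusive is an assumption.
VERDICT_BANDS = [
    (0.75, "Strong style match"),
    (0.65, "Good style match"),
    (0.55, "Moderate style match"),
    (0.45, "Weak style match"),
]


def verdict(score: float) -> str:
    """Map a raw cosine-similarity score to its heuristic label."""
    for lower_bound, label in VERDICT_BANDS:
        if score >= lower_bound:
            return label
    return "Minimal style match"


print(verdict(0.70))
```

Keeping the bands in one ordered list means a threshold can be adjusted in a single place without touching the lookup logic.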
## Credits
- Paper: Gao et al., *MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent
Text-to-Image Style Mapping*, arXiv:2604.08364, 2026.
- Upstream code: [Tencent/MegaStyle](https://github.com/Tencent/MegaStyle)
- Model weights: [Gaojunyao/MegaStyle](https://huggingface.co/Gaojunyao/MegaStyle) (MIT)
- Backbone: [google/siglip-so400m-patch14-384](https://huggingface.co/google/siglip-so400m-patch14-384) (Apache-2.0)