SigLIP to Arctic Projection Layer

A lightweight linear projection layer that maps SigLIP image embeddings (1152-dim) into the Snowflake Arctic-embed-m-v2.0 text embedding space (768-dim).

Model Details

Input: SigLIP-so400m-patch14-384 image embeddings (1152 dimensions)
Output: Arctic-embed-m-v2.0 compatible embeddings (768 dimensions)
Architecture: Linear projection (no hidden layers)
Parameters: ~885K
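
The parameter count follows directly from the layer shape. As a quick check, assuming a single weight matrix plus a bias vector (the usage snippet in this card constructs the layer with bias=True), the arithmetic is:

```python
# Parameter count of a single linear layer: a weight matrix plus a bias vector.
in_dim, out_dim = 1152, 768  # dimensions stated in this model card

weight_params = in_dim * out_dim  # 884,736
bias_params = out_dim             # 768
total_params = weight_params + bias_params

print(f"{total_params:,}")  # 885,504
```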

Training Data

Trained on ~690K image-caption pairs from DataComp-small, filtered for quality.

Performance

Metric	Value
Image-to-Text Recall@1	77.44%
Image-to-Text Recall@5	91.6%
Text-to-Image Recall@1	84.86%
Text-to-Image Recall@5	95.4%
Mean Cosine Similarity	0.458
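
For readers unfamiliar with the metric, Recall@K is the fraction of queries whose correct match appears among the top K retrieved candidates. The card does not describe the evaluation set or the exact procedure, so the following is only a minimal sketch under the assumption that query i matches candidate i. The function name recall_at_k and the toy scores are illustrative and are not results from this model.

```python
def recall_at_k(similarity, k):
    """Fraction of queries whose matching item (same index) ranks in the top k.

    similarity[i][j] is the score between query i and candidate j;
    the correct candidate for query i is assumed to be candidate i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Indices of the k highest-scoring candidates for this query.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += i in top_k
    return hits / len(similarity)


# Toy 3x3 similarity matrix (made-up numbers, not results from this model).
toy = [
    [0.9, 0.1, 0.2],  # query 0: correct item ranked 1st
    [0.5, 0.4, 0.1],  # query 1: correct item ranked 2nd
    [0.3, 0.2, 0.1],  # query 2: correct item ranked 3rd
]
r1 = recall_at_k(toy, 1)
r2 = recall_at_k(toy, 2)
```

With image embeddings as queries this corresponds to image-to-text recall; transposing the matrix gives the text-to-image direction.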

Usage

import torch
from safetensors.torch import load_file

# Load the projection layer
state_dict = load_file("model.safetensors")
projection = torch.nn.Linear(1152, 768, bias=True)
projection.load_state_dict(state_dict)

# Project SigLIP embeddings to Arctic space
siglip_embeds = ...  # [batch, 1152]
arctic_compatible = projection(siglip_embeds)  # [batch, 768]
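
Once projected, image embeddings can be compared with text embeddings in the shared 768-dim space. The sketch below ranks images for each text by cosine similarity. It uses random tensors as placeholders and an untrained stand-in layer so that it runs on its own. L2-normalizing both sides before the dot product is an assumption here, since the card does not state how embeddings should be normalized.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the loaded projection layer; in practice, load the weights
# from model.safetensors as in the snippet above.
projection = torch.nn.Linear(1152, 768, bias=True)
projection.eval()

# Random placeholders for real embeddings (illustration only).
siglip_embeds = torch.randn(4, 1152)  # [num_images, 1152]
text_embeds = torch.randn(3, 768)     # [num_texts, 768], e.g. from Arctic-embed

with torch.no_grad():
    image_embeds = projection(siglip_embeds)  # [num_images, 768]

# Assumption: L2-normalize so the dot product equals cosine similarity.
image_embeds = F.normalize(image_embeds, dim=-1)
text_embeds = F.normalize(text_embeds, dim=-1)

similarity = text_embeds @ image_embeds.T  # [num_texts, num_images]
best_image = similarity.argmax(dim=-1)     # best-matching image index per text
```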

Training

Trained for 2 epochs with a contrastive loss (temperature = 0.07).
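
The card does not specify the exact form of the contrastive loss. One common choice is a symmetric InfoNCE (CLIP-style) objective over in-batch pairs, and the sketch below assumes that form purely for illustration. The function name contrastive_loss, the batch size, and the random inputs are hypothetical. Only the temperature of 0.07 and the layer dimensions come from this card.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(image_embeds, text_embeds, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched (image, text) pairs.

    Assumption: row i of image_embeds matches row i of text_embeds, and all
    other rows in the batch serve as negatives.
    """
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    logits = image_embeds @ text_embeds.T / temperature  # [batch, batch]
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)    # image -> text direction
    loss_t2i = F.cross_entropy(logits.T, targets)  # text -> image direction
    return (loss_i2t + loss_t2i) / 2


torch.manual_seed(0)
projection = torch.nn.Linear(1152, 768)  # the only trainable module
siglip_batch = torch.randn(8, 1152)      # placeholder image embeddings
arctic_batch = torch.randn(8, 768)       # placeholder text embeddings

loss = contrastive_loss(projection(siglip_batch), arctic_batch)
loss.backward()  # gradients flow only into the projection layer
```

In this setup both encoders stay frozen and act as fixed feature extractors, so a training step only updates the projection's weight and bias.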

Model repository: carsondial/christmas-siglip-arctic-projector