# qwen3_compressed

Compact multilingual sentence encoder compressed from Qwen/Qwen3-0.6B (48x compression).

## Model Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-0.6B |
| Architecture | qwen3 (decoder) |
| Hidden dim | 448 (from 1024) |
| Layers | 4 (from 28) |
| Intermediate size | 1344 |
| Attention heads | 7 |
| KV heads | 1 |
| Vocab size | 7,341 (from 151,936) |
| Parameters | ~12.4M |
| Model size (FP32) | 47.1 MB |
| Compression | 48x |
| Distilled | No |
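The 48x figure follows directly from the parameter counts above (596M teacher vs. ~12.4M compressed); a quick sanity check, using only the numbers from the table:

```python
teacher_params = 596e6   # Qwen/Qwen3-0.6B
student_params = 12.4e6  # this model (~12.4M params)

ratio = teacher_params / student_params
print(round(ratio))  # 48
```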

## Quick Start

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("qwen3_compressed", trust_remote_code=True)

sentences = [
    "Hello, how are you?",
    "안녕하세요, 잘 지내세요?",
    "こんにちは、元気ですか？",
    "你好，你好吗？",
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (4, 448)
```
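The encoder returns plain NumPy vectors, so downstream similarity scoring is just linear algebra. A minimal sketch of pairwise cosine similarity is below; it uses random vectors as a stand-in for `model.encode(sentences)` so it runs without downloading the model:

```python
import numpy as np

def cosine_sim_matrix(emb: np.ndarray) -> np.ndarray:
    # L2-normalize each row; pairwise dot products then equal cosine similarity.
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return normed @ normed.T

# Stand-in for model.encode(sentences): 4 sentences -> 448-dim vectors
rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 448)).astype(np.float32)

sims = cosine_sim_matrix(emb)
print(sims.shape)  # (4, 4)
print(np.allclose(np.diag(sims), 1.0, atol=1e-4))  # True: self-similarity is 1
```

With the real model, passing `embeddings` from the Quick Start snippet in place of the random matrix gives the cross-lingual similarity scores directly.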

## MTEB Evaluation Results

Overall Average: 24.18%

| Task Group | Average |
|---|---|
| Classification | 32.61% |
| Clustering | 27.02% |
| STS | 12.92% |
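The overall average is the unweighted mean of the three task-group averages:

```python
group_averages = {"Classification": 32.61, "Clustering": 27.02, "STS": 12.92}

overall = sum(group_averages.values()) / len(group_averages)
print(round(overall, 2))  # 24.18
```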

### Classification

| Task | Average | Details |
|---|---|---|
| AmazonCounterfactualClassification | 53.9% | en-ext: 57.41%, en: 56.0%, de: 51.45%, ja: 50.74% |
| Banking77Classification | 9.18% | default: 9.18% |
| ImdbClassification | 50.25% | default: 50.25% |
| MTOPDomainClassification | 28.9% | th: 37.34%, es: 32.47%, de: 31.03%, en: 28.63%, fr: 26.97% |
| MassiveIntentClassification | 15.91% | it: 22.06%, pl: 21.13%, th: 21.1%, sq: 21.03%, sw: 20.97% |
| MassiveScenarioClassification | 17.94% | th: 23.36%, sq: 22.48%, cy: 22.45%, sv: 22.22%, it: 22.13% |
| ToxicConversationsClassification | 50.35% | default: 50.35% |
| TweetSentimentExtractionClassification | 34.47% | default: 34.47% |

### Clustering

| Task | Average | Details |
|---|---|---|
| ArXivHierarchicalClusteringP2P | 46.3% | default: 46.3% |
| ArXivHierarchicalClusteringS2S | 46.53% | default: 46.53% |
| BiorxivClusteringP2P.v2 | 5.86% | default: 5.86% |
| MedrxivClusteringP2P.v2 | 17.63% | default: 17.63% |
| MedrxivClusteringS2S.v2 | 18.42% | default: 18.42% |
| StackExchangeClustering.v2 | 40.91% | default: 40.91% |
| StackExchangeClusteringP2P.v2 | 30.96% | default: 30.96% |
| TwentyNewsgroupsClustering.v2 | 9.52% | default: 9.52% |

### STS

| Task | Average | Details |
|---|---|---|
| BIOSSES | 1.1% | default: 1.1% |
| SICK-R | 22.32% | default: 22.32% |
| STS12 | 7.97% | default: 7.97% |
| STS13 | 18.76% | default: 18.76% |
| STS14 | 10.4% | default: 10.4% |
| STS15 | 24.3% | default: 24.3% |
| STS17 | 10.58% | en-en: 25.95%, es-es: 25.71%, ko-ko: 21.6%, ar-ar: 20.19%, it-en: 8.19% |
| STSBenchmark | 7.96% | default: 7.96% |

## Training

Created via multi-method model compression (no additional training):

1. Teacher: Qwen/Qwen3-0.6B (28 layers, 1024-dim hidden, 596M params)
2. Layer pruning: 28 → 4 layers (uniform selection)
3. Hidden dim: 1024 → 448
4. Vocab pruning: 151,936 → 7,341 tokens (90% cumulative frequency cutoff)
5. Compression ratio: 48x
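The model card does not ship the compression code; the two pruning steps can nonetheless be sketched as below. Both helpers (`uniform_layers`, `prune_vocab`) are illustrative assumptions about how "uniform selection" and "90% cumulative frequency" might be implemented, not the released pipeline:

```python
from collections import Counter

def uniform_layers(n_teacher: int, n_student: int) -> list[int]:
    # Pick n_student layer indices evenly spaced across the teacher's depth.
    step = (n_teacher - 1) / (n_student - 1)
    return [round(i * step) for i in range(n_student)]

def prune_vocab(token_counts: Counter, coverage: float = 0.90) -> list:
    # Keep the most frequent tokens until they account for `coverage`
    # of all token occurrences (cumulative-frequency cutoff).
    total = sum(token_counts.values())
    kept, running = [], 0
    for token, count in token_counts.most_common():
        kept.append(token)
        running += count
        if running / total >= coverage:
            break
    return kept

print(uniform_layers(28, 4))  # [0, 9, 18, 27]

counts = Counter({"the": 50, "a": 30, "cat": 10, "dog": 6, "rare": 4})
print(prune_vocab(counts))    # ['the', 'a', 'cat']  (these cover 90%)
```

Applied to the real tokenizer, the frequency cut over a multilingual corpus is what shrinks the vocabulary from 151,936 to 7,341 entries while keeping 90% token coverage.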

## Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl
