embedl/paraphrase-multilingual-MiniLM-L12-v2-quantized-trt

@@ -125,10 +125,23 @@ Latency measured with TensorRT + `trtexec`, GPU compute time only
 Evaluated on the sts17 validation split. The quantized model stays
 within 0.023 Spearman ρ of the FP32 baseline on every language pair.
-| Model | Spearman ρ | spearman_ar_ar | spearman_default | spearman_en_ar | spearman_en_de | spearman_en_en | spearman_en_tr | spearman_es_en | spearman_es_es | spearman_fr_en | spearman_it_en | spearman_ko_ko | spearman_nl_en |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` FP32 (ours) | 0.8130 | 0.7915 | 0.7970 | 0.8122 | 0.8422 | 0.8687 | 0.7674 | 0.8444 | 0.8556 | 0.7659 | 0.8235 | 0.7703 | 0.8171 |
-| **Embedl Paraphrase Multilingual Minilm L12 V2 INT8** | **0.8008** | **0.7906** | **0.7868** | **0.7914** | **0.8215** | **0.8638** | **0.7555** | **0.8300** | **0.8328** | **0.7536** | **0.8148** | **0.7628** | **0.8059** |
+| Metric | FP32 (ours) | **Embedl Paraphrase Multilingual Minilm L12 V2 INT8** | Δ |
+|---|---|---|---|
+| Spearman ρ | 0.8130 | **0.8008** | -0.0122 |
+| ρ (ar-ar) | 0.7915 | **0.7906** | -0.0010 |
+| ρ (default) | 0.7970 | **0.7868** | -0.0102 |
+| ρ (en-ar) | 0.8122 | **0.7914** | -0.0208 |
+| ρ (en-de) | 0.8422 | **0.8215** | -0.0207 |
+| ρ (en-en) | 0.8687 | **0.8638** | -0.0049 |
+| ρ (en-tr) | 0.7674 | **0.7555** | -0.0119 |
+| ρ (es-en) | 0.8444 | **0.8300** | -0.0143 |
+| ρ (es-es) | 0.8556 | **0.8328** | -0.0228 |
+| ρ (fr-en) | 0.7659 | **0.7536** | -0.0123 |
+| ρ (it-en) | 0.8235 | **0.8148** | -0.0087 |
+| ρ (ko-ko) | 0.7703 | **0.7628** | -0.0075 |
+| ρ (nl-en) | 0.8171 | **0.8059** | -0.0112 |
+FP32 baseline: [`sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2).
 ## Creating Your Own Optimized Models
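
The Δ column in the new table can be sanity-checked with a short script. The sketch below assumes Δ is simply the INT8 score minus the FP32 score; the dictionary name `scores` and its keys are illustrative and not part of the model card. Because it starts from the rounded four-decimal values shown in the table, a recomputed Δ can differ from the published one in the last digit (the ar-ar and es-en rows do), presumably because the published Δ was derived from unrounded scores.

```python
# Spearman scores copied from the table: (FP32, INT8) per metric.
# The key names are illustrative labels, not identifiers from the repo.
scores = {
    "overall": (0.8130, 0.8008),
    "ar-ar": (0.7915, 0.7906),
    "default": (0.7970, 0.7868),
    "en-ar": (0.8122, 0.7914),
    "en-de": (0.8422, 0.8215),
    "en-en": (0.8687, 0.8638),
    "en-tr": (0.7674, 0.7555),
    "es-en": (0.8444, 0.8300),
    "es-es": (0.8556, 0.8328),
    "fr-en": (0.7659, 0.7536),
    "it-en": (0.8235, 0.8148),
    "ko-ko": (0.7703, 0.7628),
    "nl-en": (0.8171, 0.8059),
}

# Assumed definition: delta = INT8 - FP32, rounded to four decimals.
deltas = {name: round(int8 - fp32, 4) for name, (fp32, int8) in scores.items()}

# The metric with the largest accuracy drop (most negative delta).
worst = min(deltas, key=deltas.get)

for name, delta in deltas.items():
    print(f"{name:8s} {delta:+.4f}")
print(f"largest drop: {worst} ({deltas[worst]:+.4f})")
```

Reporting the largest drop alongside the overall score gives a single number that bounds the accuracy loss across all language pairs.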