Question about adapter choice on MTEB Retrieval (retrieval vs text-matching)
Hi Jina team, thanks a lot for releasing Jina Embeddings v4. It’s a really impressive model.
I have a question about the adapter / task setting you used for the MTEB Retrieval evaluation.
In your technical report, Table A11 (“Evaluation Results for Various Models on MTEB Retrieval Task”) indicates that the MTEB Retrieval results were obtained using the text-matching adapter. This confused me a bit, because I would expect the retrieval adapter to be used for retrieval benchmarks.
To sanity-check this, I ran a small reproduction on a couple of MTEB Retrieval datasets and observed behavior that suggests the optimal adapter may differ by dataset:
FiQA: using the retrieval adapter I can reach the reported score (~47.678), while the text-matching adapter gives only ~34.3.
ArguAna: using the text-matching adapter I can reach the reported score (~67.07), while the retrieval adapter is noticeably lower.
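For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the idea rather than my exact script. It assumes a hypothetical `encode_fn(texts, task, role)` wrapper around the model (the name, signature, and the "query"/"passage" role labels are my own placeholders, not part of any Jina or MTEB API) and scores each adapter with a plain linear-gain nDCG@10, which may differ slightly from the official MTEB implementation.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical wrapper type: encode_fn(texts, task, role) -> one vector per text.
# `task` is the adapter name; `role` is "query" or "passage".
EncodeFn = Callable[[Sequence[str], str, str], List[List[float]]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ndcg_at_k(ranked_ids: Sequence[str], relevant: Dict[str, int], k: int = 10) -> float:
    """nDCG@k with linear gains for one query."""
    dcg = sum(
        relevant.get(doc_id, 0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    ideal = sorted(relevant.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def compare_adapters(
    encode_fn: EncodeFn,
    queries: Dict[str, str],
    corpus: Dict[str, str],
    qrels: Dict[str, Dict[str, int]],
    tasks: Sequence[str] = ("retrieval", "text-matching"),
    k: int = 10,
) -> Dict[str, float]:
    """Return the mean nDCG@k per adapter on the same queries and corpus."""
    doc_ids = list(corpus)
    # Only score queries that have relevance judgments.
    query_ids = [q for q in queries if q in qrels]
    results: Dict[str, float] = {}
    for task in tasks:
        doc_vecs = encode_fn([corpus[d] for d in doc_ids], task, "passage")
        query_vecs = encode_fn([queries[q] for q in query_ids], task, "query")
        per_query = []
        for qid, q_vec in zip(query_ids, query_vecs):
            sims = [(cosine(q_vec, d_vec), d) for d, d_vec in zip(doc_ids, doc_vecs)]
            ranked = [d for _, d in sorted(sims, key=lambda p: p[0], reverse=True)]
            per_query.append(ndcg_at_k(ranked, qrels[qid], k))
        results[task] = sum(per_query) / len(per_query) if per_query else 0.0
    return results
```

Running this with the same queries, corpus, and qrels for both adapters keeps everything else fixed, so any score gap comes from the adapter choice alone.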
So I’m wondering:
Did you actually run different adapters for different Retrieval datasets (even though Table A11 labels them under text-matching)?
If so, is there a recommended mapping (which Retrieval datasets should use retrieval vs text-matching)?
Or is Table A11 possibly mislabeled, and MTEB Retrieval was evaluated with the retrieval adapter (or a mixed strategy)?
Any clarification would be greatly appreciated. I’d love to make sure I’m using the intended setup correctly.
Thanks again!