Arabic Matryoshka & GATE Embedding Models Collection
A collection of advanced Arabic Matryoshka Embedding Models designed for efficient, high-performance Arabic NLP, publicly available on Hugging Face.
This is a sentence-transformers model fine-tuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 on the Omartificial-Intelligence-Space/arabic-n_li-triplet dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. The model is part of the Arabic Matryoshka Embedding Models collection and was presented in the paper "GATE: General Arabic Text Embedding for Enhanced Semantic Textual Similarity with Matryoshka Representation Learning and Hybrid Loss Training".
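The key property of a Matryoshka embedding model is that the first *k* components of the full 384-dimensional vector already form a usable lower-dimensional embedding: you truncate and re-normalize, trading a little accuracy for memory and speed. The sketch below illustrates that truncate-and-renormalize step with NumPy on random stand-in vectors (no model download); the vectors and the helper names are illustrative, not part of the model's API.

```python
import numpy as np

def truncate_and_normalize(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of a Matryoshka embedding and
    re-normalize to unit length, the standard way to use such a model
    at a reduced dimension."""
    truncated = embedding[:dim]
    return truncated / np.linalg.norm(truncated)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for two 384-dim sentence embeddings (random, illustration only).
rng = np.random.default_rng(42)
emb_a = rng.normal(size=384)
emb_b = emb_a + 0.1 * rng.normal(size=384)  # a slightly perturbed "paraphrase"

# Similarity stays meaningful as the dimension shrinks.
for dim in (384, 256, 128, 64):
    sim = cosine(truncate_and_normalize(emb_a, dim),
                 truncate_and_normalize(emb_b, dim))
    print(f"dim={dim:3d}  cosine={sim:.4f}")
```

With the real model, the same effect is obtained by slicing the vectors returned by `SentenceTransformer.encode` (or, in recent sentence-transformers releases, by passing a reduced `truncate_dim` when loading the model).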