e-Procure Product Embeddings

Bilingual (English/Arabic) sentence embeddings fine-tuned for B2B procurement product matching on the e-Procure platform.

Model Description

Fine-tuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 on 48,000 product pairs from Saudi Arabian B2B procurement catalogs. Optimized for matching purchase requests to supplier catalog items across English and Arabic.

Key Capabilities

  • Cross-lingual matching: Match English request-for-quotation (RFQ) terms to Arabic product descriptions and vice versa
  • Industry-specific: Trained on construction, electrical, HVAC, plumbing, and safety equipment catalogs
  • SKU-aware: Understands product codes, part numbers, and technical specifications

Training Data

Category                  English Pairs  Arabic Pairs  Cross-lingual Pairs
Construction Materials            8,200         6,100                3,400
Electrical Equipment              7,500         5,800                2,900
HVAC Systems                      5,100         4,200                2,100
Plumbing Supplies                 4,800         3,600                1,800
Safety Equipment                  3,900         2,800                1,500

Usage

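from sentence_transformers import SentenceTransformer, util

# Load the fine-tuned bilingual model from the Hugging Face Hub.
model = SentenceTransformer("brijeshvadi/eprocure-product-embeddings")

# The same purchase request phrased in English and in Arabic.
queries = ["3-phase circuit breaker 400A", "قاطع دائرة ثلاثي الطور 400 أمبير"]
# Supplier catalog items to match against.
products = ["ABB SACE Tmax XT4 400A 3P MCCB", "Schneider NSX400N 3P 400A"]

# Encode both sides into 384-dimensional vectors.
query_emb = model.encode(queries)
product_emb = model.encode(products)

# Cosine similarity matrix of shape (len(queries), len(products)).
scores = util.cos_sim(query_emb, product_emb)
print(scores)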
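
Once queries and products are encoded, matching usually comes down to ranking catalog items by cosine similarity to each query. The following is a minimal sketch of that ranking step in plain Python. It assumes cosine similarity is the intended metric, which the card does not state explicitly. The helper names `cosine_similarity` and `rank_products` are hypothetical, as are the toy 3-dimensional vectors that stand in for the model's 384-dimensional output.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_products(
    query_vec: Sequence[float],
    product_vecs: Sequence[Sequence[float]],
    product_names: Sequence[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (name, score) pairs, best match first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in zip(product_names, product_vecs)
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy vectors (hypothetical data, not real model output).
toy_products = ["item A", "item B", "item C"]
toy_product_vecs = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
toy_query_vec = [0.9, 0.1, 0.0]

ranking = rank_products(toy_query_vec, toy_product_vecs, toy_products, top_k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With real embeddings, each row of `query_emb` and `product_emb` would take the place of the toy vectors. For large catalogs, a vectorized or approximate nearest-neighbor search would likely be preferable to this per-item loop.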

Architecture

  • Base: paraphrase-multilingual-MiniLM-L12-v2
  • Embedding Dim: 384
  • Max Seq Length: 128
  • Pooling: Mean pooling
  • Training Loss: MultipleNegativesRankingLoss + CosineSimilarityLoss
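
To make the two training objectives concrete, here is a rough sketch in plain Python of what each loss computes. It is an illustration of the general definitions, not the training code used for this model. The function names are hypothetical, the `scale=20.0` value mirrors the sentence-transformers default rather than a setting confirmed by this card, and the card does not say how the two losses were combined (for example summed, or alternated across batches).

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(
    anchors: Sequence[Sequence[float]],
    positives: Sequence[Sequence[float]],
    scale: float = 20.0,  # assumed value, see lead-in
) -> float:
    """In-batch softmax cross-entropy.

    positives[i] is the true match for anchors[i]; every other
    positives[j] in the batch is treated as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits)
        )
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def cosine_similarity_loss(
    first: Sequence[Sequence[float]],
    second: Sequence[Sequence[float]],
    labels: Sequence[float],
) -> float:
    """Mean squared error between cosine similarity and a gold score."""
    errors = [(cos_sim(a, b) - y) ** 2 for a, b, y in zip(first, second, labels)]
    return sum(errors) / len(errors)


# Toy batch (hypothetical data): two orthogonal request/product pairs.
requests = [[1.0, 0.0], [0.0, 1.0]]
matched = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = multiple_negatives_ranking_loss(requests, matched)
swapped_loss = multiple_negatives_ranking_loss(requests, swapped)
print(aligned_loss < swapped_loss)
```

The ranking loss only needs positive pairs, because the other items in the batch serve as negatives. The cosine similarity loss needs an explicit similarity score for every pair.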

Platform Context

Built for e-Procure, a B2B procurement platform serving Saudi Arabian construction and industrial supply chains. The platform uses Next.js 15, Strapi CMS, and Redux Toolkit Query.

