# ntu-bge-small-zh-simcse-job-talent-matching
Fine-tuned BAAI/bge-small-zh-v1.5 using Supervised SimCSE for job-talent matching.
## Training Details
- Base model: BAAI/bge-small-zh-v1.5
- Method: Supervised SimCSE (dual encoder + in-batch contrastive loss)
- Task: Job-talent matching (職缺-人才配對)
- Temperature (τ): 0.05
- Max length: 512
- Batch size: 64
- Optimizer: AdamW (lr=5e-5, wd=1e-2)
- Early stopping: patience=3
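The supervised SimCSE objective above treats each (job, talent) pair in a batch as the positive, and every other talent in the same batch as a negative, scoring pairs by temperature-scaled cosine similarity. A minimal sketch of this in-batch contrastive (InfoNCE) loss, with the τ = 0.05 used here (illustrative only, not the exact training code):

```python
import torch
import torch.nn.functional as F

def simcse_loss(job_emb: torch.Tensor, talent_emb: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive loss: for row i, the positive is talent_emb[i];
    all other rows of talent_emb act as negatives."""
    # L2-normalize so dot products are cosine similarities
    job_emb = F.normalize(job_emb, dim=-1)
    talent_emb = F.normalize(talent_emb, dim=-1)
    # (B, B) similarity matrix, scaled by the temperature
    sim = job_emb @ talent_emb.T / temperature
    # The correct match for each job sits on the diagonal
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)
```

With a low temperature like 0.05, the softmax sharpens, so the loss strongly penalizes any negative pair scored close to the positive.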
## Usage
```python
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("yenstdi/ntu-bge-small-zh-simcse-job-talent-matching")
model = AutoModel.from_pretrained("yenstdi/ntu-bge-small-zh-simcse-job-talent-matching")
model.eval()  # disable dropout for deterministic inference

def encode(texts):
    # Mean pooling over non-padding tokens, followed by L2 normalization
    inputs = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    mask = inputs["attention_mask"].unsqueeze(-1)
    embeddings = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-6)
    return torch.nn.functional.normalize(embeddings, dim=-1)

job_emb = encode(["Software Engineer - Python, ML experience required"])
talent_emb = encode(["5 years Python developer with ML projects"])
score = (job_emb @ talent_emb.T).item()
print(f"Cosine similarity: {score:.4f}")
```
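In practice you will usually score one job posting against many candidates at once. Since the embeddings returned by `encode` are already L2-normalized, a matrix product gives all cosine similarities in one step; sorting them yields a ranking. A minimal sketch with a hypothetical `rank_candidates` helper (the candidate texts are made up for illustration):

```python
import torch

def rank_candidates(job_emb: torch.Tensor, talent_embs: torch.Tensor, texts: list[str]):
    """Rank candidate texts by cosine similarity to a single job embedding.
    Assumes both inputs are L2-normalized, as encode() returns."""
    scores = talent_embs @ job_emb.squeeze(0)  # (N,) cosine similarities
    order = scores.argsort(descending=True)
    return [(texts[i], scores[i].item()) for i in order]
```

Called as `rank_candidates(encode(["..."]), encode(candidate_texts), candidate_texts)`, this returns (text, score) pairs from best to worst match.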
## Trained by
National Taiwan University (NTU)