SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
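
The contrastive step works on pairs of texts marked as similar or dissimilar. As a rough illustration only, the sketch below builds such pairs from a few labeled texts, using a target of 1.0 for same-label pairs and 0.0 otherwise. The `make_pairs` helper and the toy data are hypothetical and are not SetFit's actual sampling code.

```python
from itertools import combinations


def make_pairs(texts, labels):
    """Return (text_a, text_b, target) triples: 1.0 if labels match, else 0.0."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((a, b, target))
    return pairs


# Toy examples (hypothetical, not taken from the training set).
texts = [
    "merged with another item",
    "empty after merge",
    "notable village",
    "notable compound",
]
labels = ["delete", "delete", "keep", "keep"]

pairs = make_pairs(texts, labels)
for a, b, target in pairs:
    print(f"{target:.1f}  {a!r} / {b!r}")
```

Pairs like these are what a similarity-based loss consumes: the embedding model is pushed to score same-label pairs close to 1 and mixed-label pairs close to 0.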

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-base-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 6

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
5	'###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q9014435: Template:Rfd links Merged with Q4063392 . Jssfrk ([[User talk:Jssfrk
1	'###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q6995874: Neraidochori (Q6995874) : village in Thessaly, Greece : ( [MASK]
4	'###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q12640552: Template:Rfd links Empty, leftover from merging.-- Pütz M. ([[User talk:Pütz M.
2	'###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q32113012: Category:Dames, Special Class of the Order of the Starry Cross (Q32113012) : Wikimedia category : ( [MASK]
0	'###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q755378: Beia (Q755378) : village in Brașov County, Romania : ( [MASK]
3	"###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q19622665: hydrazine sulfate (Q19622665) : chemical compound : ( [MASK]

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("research-dump/bge-base-en-v1.5_wikidata_entity_outcome_prediction_v1")
# Run inference on a single prompt; calling the model returns its predicted label
text = "###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input:  Q16629320: Template:Rfd links Merged with Q15628951 , via The Game  -- Moxfyre ([[User talk:Moxfyre| int:Talkpagelinktext ]]) 18:14, 2 July 2014 (UTC)"
preds = model(text)
print(preds)
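
The inputs shown in this card all share the same instruction prefix. A small helper can keep that prefix consistent across calls; the sketch below is one way to do it. The `build_prompt` function and `LABELS` constant are hypothetical names, and the template is inferred from the examples in the Model Labels table rather than from any documented specification.

```python
# Label names as they appear in the instruction text of the examples.
LABELS = ["delete", "keep", "speedy delete", "comment"]


def build_prompt(discussion_text: str) -> str:
    """Wrap raw discussion text in the instruction template seen in the label examples."""
    instruction = (
        "###Instruction: Multi-class classification, answer with one of the labels: "
        f"[{', '.join(LABELS)}] : "
    )
    return f"{instruction}###Input: {discussion_text}"


prompt = build_prompt("Q16629320: Template:Rfd links Merged with Q15628951")
print(prompt)
```

The resulting string can be passed to the loaded model in place of the hand-written input; for several inputs at once, `model.predict` accepts a list of strings.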

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	29	52.91	991

Label	Training Sample Count
0	1
1	514
2	12
3	1
4	39
5	133
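
The counts above are heavily skewed: label 1 dominates, while labels 0 and 3 have a single example each. The sketch below quantifies that skew and computes inverse-frequency weights with the common "balanced" heuristic, total / (number of classes × count). It is an illustration only; the card does not state whether any class weighting was applied.

```python
# Training sample counts per label, copied from the table above.
counts = {0: 1, 1: 514, 2: 12, 3: 1, 4: 39, 5: 133}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# "Balanced" inverse-frequency weights: total / (num_classes * count).
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in sorted(counts):
    print(f"label {label}: n={counts[label]:>3}  share={shares[label]:.3f}  weight={weights[label]:.2f}")
```

With so few examples for labels 0 and 3, per-class metrics for those labels should be read with caution.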

Training Hyperparameters

batch_size: (8, 2)
num_epochs: (1, 16)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 50
body_learning_rate: (0.0001, 0.0001)
head_learning_rate: 5e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: True
use_amp: True
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
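
Tuple-valued settings such as batch_size and num_epochs give one value for the embedding fine-tuning phase and one for the classifier phase, in that order. The sketch below estimates the number of embedding-phase steps per epoch from these settings. It rests on two assumptions: that the training set size equals the sum of the label counts (700), and that 2 × num_iterations pairs are generated per example. Neither is stated in the card, though the result agrees with the Epoch and Step columns under Training Results.

```python
n_train = 1 + 514 + 12 + 1 + 39 + 133   # sum of the label counts (assumed training set size)
num_iterations = 50
embedding_batch_size = 8                 # first element of batch_size

# Assumption: num_iterations positive and num_iterations negative pairs per example.
n_pairs = 2 * num_iterations * n_train
steps_per_epoch = n_pairs // embedding_batch_size


def epoch_at(step: int) -> float:
    """Fraction of an epoch completed at a given step, rounded like the results table."""
    return round(step / steps_per_epoch, 4)


print(n_pairs, steps_per_epoch)
print(epoch_at(500), epoch_at(8500))
```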

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0001	1	0.1493	-
0.0571	500	0.1114	0.1701
0.1143	1000	0.0474	0.1838
0.1714	1500	0.0418	0.1427
0.2286	2000	0.0317	0.1665
0.2857	2500	0.0296	0.1820
0.3429	3000	0.0220	0.1714
0.4000	3500	0.0245	0.1899
0.4571	4000	0.0222	0.1951
0.5143	4500	0.0176	0.2051
0.5714	5000	0.0134	0.2062
0.6286	5500	0.0099	0.2131
0.6857	6000	0.0086	0.2020
0.7429	6500	0.0090	0.1906
0.8000	7000	0.0042	0.1960
0.8571	7500	0.0032	0.1942
0.9143	8000	0.0028	0.1941
0.9714	8500	0.0035	0.1951
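
Training loss falls steadily in this table while validation loss reaches its minimum early and then drifts upward, a pattern often read as overfitting. The sketch below locates the step with the lowest validation loss and compares it with the last logged step. The rows are copied from the table (step 1 is omitted because it has no validation loss); the interpretation is a suggestion, not a claim made by the card.

```python
# (step, training_loss, validation_loss) rows copied from the table above.
results = [
    (500, 0.1114, 0.1701), (1000, 0.0474, 0.1838), (1500, 0.0418, 0.1427),
    (2000, 0.0317, 0.1665), (2500, 0.0296, 0.1820), (3000, 0.0220, 0.1714),
    (3500, 0.0245, 0.1899), (4000, 0.0222, 0.1951), (4500, 0.0176, 0.2051),
    (5000, 0.0134, 0.2062), (5500, 0.0099, 0.2131), (6000, 0.0086, 0.2020),
    (6500, 0.0090, 0.1906), (7000, 0.0042, 0.1960), (7500, 0.0032, 0.1942),
    (8000, 0.0028, 0.1941), (8500, 0.0035, 0.1951),
]

# Step with the lowest validation loss, and the last logged step for comparison.
best_step, best_train, best_val = min(results, key=lambda row: row[2])
final_step, final_train, final_val = results[-1]

print(f"lowest validation loss {best_val} at step {best_step}")
print(f"final step {final_step}: train {final_train}, validation {final_val}")
```

Because load_best_model_at_end is False in the hyperparameters, the published weights presumably come from the end of training rather than from the lowest-validation-loss step.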

Framework Versions

Python: 3.10.12
SetFit: 1.1.0
Sentence Transformers: 3.3.1
Transformers: 4.44.1
PyTorch: 2.2.1+cu121
Datasets: 2.21.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}


Model size: 0.1B params (Safetensors, tensor type F32)
