SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
Model Details
Model Description
Model Sources
Model Labels
| Label | Examples |
|:------|:---------|
| 5 | <ul><li>'"All of the cited sources do mention her, and enable reliable sourcing of her childhood, education, and acting career. There are reviews of her acting that can be added. However, the reason why I created this article is her play ''How to Load a Musket'' which I believe passes . 3. ""The person has created... (a work that) ha(s) been the primary subject... of multiple independent periodical articles or reviews"'</li><li>'I concur. The company does not appear to meet and</li></ul> |
| 6 | <ul><li>'"I was the one who put this up for deletion, and I almost want to change my own vote, because of the disregard for the rules which people are performing by excercising such predjudice against this page. Come on people, the problem is that there are no reliable secondary sources as of yet, not that ""evil bloggers"" are trying to rule the world, or that neolgisms must be squooshed without mercy. The issue here is lack of established credible sources because it's way too young to have sources that qualify under "'</li><li>"Um, with all due respect, her widely publicized grassroots-organized ouster is what she is notable for, if being an elected official somehow wasn't enough. Nominator hasn't even looked at the massive amount of supporting media coverage, otherwise he would have known that the contents of the since-removed YouTube video are well documented by</li></ul> |
| 8 | |
| 2 | <ul><li>'as Carlos Suarez has pointed out this is an unsourced violation which even if sourced would likely fail our biographical guidelines anyhow. Lose-lose'</li><li>"While he isn't super notable, I don't see why he should be excluded any more than say a no-name backbench MP from one of the major parties should be excluded. Ultimately, he received considerable media coverage as a result of being elected to federal Parliament – and then again when he subsequently lost his spot when the results were declared void. He again ran for election at the special election, and there was coverage on him following his failure to gain a seat at that. It's really more than just</li></ul> |
| 4 | <ul><li>'unsourced, unverified - could be</li></ul> |
| 3 | <ul><li>'complete nonsense, meets #G1</li></ul> |
| 7 | <ul><li>'"per nom. Insufficiently notable publication, and no ""''credible, third-party sources with a reputation for fact-checking and accuracy''"" as required by "'</li><li>'I mentioned the NYT link for purposes, not , as I believe was clear'</li><li>"As it was mentioned, there are remarkable claims being made in the article that need to be</li></ul> |
| 9 | <ul><li>'as the one who PRODded it; I was tempted to tag its initial incarnation as G11, but I still think it should be deleted per '</li><li>"I don't understand why you're all voting to keep. Don't you know what a son of a bitch is? Isn't policy, or has that been rejected now"</li><li>'Big ole mess of and '</li></ul> |
| 1 | <ul><li>'Bot-like nomination made without any</li></ul> |
| 0 | |
Uses
Direct Use for Inference
First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference:

```python
from setfit import SetFitModel

# Download from the Hub
model = SetFitModel.from_pretrained("research-dump/bge-base-en-v1.5_wikipedia_policy_wikipedia_policy")
# Run inference
preds = model("fails ")
```
Training Details
Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count | 2 | 38.196 | 433 |
| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 23 |
| 1 | 17 |
| 2 | 21 |
| 3 | 17 |
| 4 | 39 |
| 5 | 671 |
| 6 | 60 |
| 7 | 36 |
| 8 | 100 |
| 9 | 16 |
Training Hyperparameters
- batch_size: (8, 2)
- num_epochs: (5, 5)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 10
- body_learning_rate: (1e-05, 1e-05)
- head_learning_rate: 5e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: True
- use_amp: True
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0004 | 1 | 0.2133 | - |
| 0.2 | 500 | 0.2428 | 0.2210 |
| 0.4 | 1000 | 0.1484 | 0.1927 |
| 0.6 | 1500 | 0.0528 | 0.1995 |
| 0.8 | 2000 | 0.0335 | 0.2373 |
| 1.0 | 2500 | 0.0346 | 0.2294 |
| 1.2 | 3000 | 0.0267 | 0.2447 |
| 1.4 | 3500 | 0.0239 | 0.2290 |
| 1.6 | 4000 | 0.0253 | 0.2354 |
| 1.8 | 4500 | 0.0219 | 0.2390 |
| 2.0 | 5000 | 0.02 | 0.2335 |
| 2.2 | 5500 | 0.019 | 0.2319 |
| 2.4 | 6000 | 0.0168 | 0.2281 |
| 2.6 | 6500 | 0.0154 | 0.2499 |
| 2.8 | 7000 | 0.013 | 0.2537 |
| 3.0 | 7500 | 0.015 | 0.2408 |
| 3.2 | 8000 | 0.0121 | 0.2423 |
| 3.4 | 8500 | 0.015 | 0.2391 |
| 3.6 | 9000 | 0.0131 | 0.2452 |
| 3.8 | 9500 | 0.0106 | 0.2438 |
| 4.0 | 10000 | 0.0135 | 0.2330 |
| 4.2 | 10500 | 0.0114 | 0.2396 |
| 4.4 | 11000 | 0.0115 | 0.2413 |
| 4.6 | 11500 | 0.0112 | 0.2348 |
| 4.8 | 12000 | 0.0111 | 0.2378 |
| 5.0 | 12500 | 0.013 | 0.2387 |
Framework Versions
- Python: 3.12.7
- SetFit: 1.1.1
- Sentence Transformers: 3.4.1
- Transformers: 4.48.2
- PyTorch: 2.6.0+cu124
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}