# SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
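The contrastive step works on pairs of training sentences: two sentences sharing a label are treated as a positive pair (their embeddings are pulled together), while two sentences with different labels form a negative pair (their embeddings are pushed apart). A minimal sketch of that pair generation, for illustration only; the actual sampling is handled internally by the setfit trainer:

```python
from itertools import combinations

def generate_pairs(sentences, labels):
    """Build (sentence_a, sentence_b, target) triples for contrastive fine-tuning.

    Pairs with matching labels get target 1.0 (pull embeddings together);
    pairs with differing labels get target 0.0 (push embeddings apart).
    """
    pairs = []
    for (s1, l1), (s2, l2) in combinations(zip(sentences, labels), 2):
        pairs.append((s1, s2, 1.0 if l1 == l2 else 0.0))
    return pairs

# Toy inputs: two label-1 sentences and one label-0 sentence.
pairs = generate_pairs(["good answer", "solid answer", "bad answer"], [1, 1, 0])
# Yields one positive pair and two negative pairs.
```

With `num_iterations: 20` (see the hyperparameters below), setfit samples pairs like these repeatedly rather than enumerating every combination, which is what makes few-shot training efficient.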
## Model Details

### Model Description

### Model Sources

### Model Labels
| Label | Examples |
|:------|:---------|
| 1 | <ul><li>"Reasoning: \n\n1. Context Grounding: The provided document details the events leading to Joan Gaspart’s resignation and confirms it happened after a poor season in 2003.\n2. Relevance: The answer directly addresses the specific question about who resigned from the presidency after Barcelona's poor showing in the 2003 season.\n3. Conciseness: The answer is clear and to the point without unnecessary information.\n\nFinal Result:"</li><li>"Reasoning:\n1. Context Grounding: The answer appropriately pulls references from the document, mentioning the hazards of working with electricity and the potential for long-term issues if electrical work isn't done correctly, which aligns with the provided content.\n2. Relevance: The answer directly addresses why it is beneficial to hire a professional electrician like O’Hara Electric, explicitly tying into the concerns of safety, expertise, and ensuring the job is done correctly on the first attempt.\n3. Conciseness: The answer is concise and to the point, avoiding unnecessary information and sticking closely to the reasons why hiring a professional is recommended.\n\nFinal Result:"</li><li>'Reasoning:\n1. Context Grounding: The provided document explicitly states that Aerosmith's 1987 comeback album was "Permanent Vacation". The answer is directly supported by this information.\n2. Relevance: The answer is directly related to and completely addresses the question about the title of Aerosmith's 1987 comeback album.\n3. Conciseness: The answer is concise and to the point, providing only the necessary information without any extraneous details.\n\nFinal result:'</li></ul> |
| 0 | <ul><li>'Reasoning:\n1. Context Grounding: The response effectively uses the provided document to form its recommendations. It pulls together various tips on identifying and avoiding triggers, utilizing sensory substitutes, and participating in alternative activities to manage cravings. \n2. Relevance: The answer remains focused on the question, directly addressing how to stop cravings for smoking by providing actionable advice and methods.\n3. Conciseness: While informative, the answer could benefit from being slightly more concise. Some points, such as the detailed explanation about licorice root, might be trimmed for brevity.\n\nFinal Result:'</li><li>'Reasoning:\n1. Context Grounding: The provided answer is rooted in the document, which mentions that Amy Bloom finds starting a project hard and having to clear mental space, recalibrate, and become less involved in her everyday life.\n2. Relevance: The response accurately focuses on the challenges Bloom faces when starting a significant writing project, without deviating into irrelevant areas.\n3. Conciseness: The answer effectively summarizes the relevant information from the document, staying clear and to the point while avoiding unnecessary detail.\n\nFinal Result:'</li><li>'Reasoning:\n1. Context Grounding: The provided document does have a listing for a 6 bedroom detached house. The factual details in the answer given by the user are not present in the document.\n2. Relevance: The user’s answer lists a different price (£2,850,000 vs. £950,000) and an incorrect address (Highgate Lane, Leeds, Berkshire, RG12 vs Willow Drive, Twyford, Reading, Berkshire, RG10) than the provided document.\n3. Conciseness: The user’s answer, though concise, is factually incorrect based on the document.\n\nFinal Result:'</li></ul> |
## Evaluation

### Metrics
| Label | Accuracy |
|:------|:---------|
| all | 0.8108 |
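Accuracy here is the share of evaluation examples whose predicted label matches the reference label, so 0.8108 means roughly 81% of the held-out examples were classified correctly. A minimal sketch of the metric (the example labels below are hypothetical, not taken from this card's evaluation set):

```python
def accuracy(predictions, references):
    """Return the fraction of positions where prediction equals reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must be the same length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Hypothetical predictions vs. references: 3 of 4 match.
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 0.75
```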
## Uses

### Direct Use for Inference
First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the Hugging Face Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wikisum_gpt-4o_cot-instructions_remove_final_evaluation_e1_larger_train_1")
# Run inference
preds = model("Reasoning: The given answer is well-supported by the provided document and includes details that are directly relevant to the identification of a funnel spider, such as the dark brown or black body and the presence of a hard, shiny carapace. It also mentions the two large fangs which is a key characteristic. The answer stays focused on the specific question of identifying a funnel spider, without deviating into unrelated topics.\nFinal Result:")
```
## Training Details

### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:--------|:----|
| Word count | 33 | 88.6482 | 198 |
| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 95 |
| 1 | 104 |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
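These names correspond to fields of setfit's `TrainingArguments`. A hedged sketch of how a comparable run could be configured (the dataset is a placeholder and is not part of this card; this is an illustration of the argument names above, not the exact training script):

```python
from setfit import SetFitModel, Trainer, TrainingArguments

# Start from the same base embedding model as this card.
model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")

args = TrainingArguments(
    batch_size=(16, 16),              # (embedding phase, classifier phase)
    num_epochs=(1, 1),                # one epoch for each phase
    num_iterations=20,                # contrastive pairs sampled per example
    body_learning_rate=(2e-5, 2e-5),  # Sentence Transformer body
    head_learning_rate=2e-5,          # classification head
    sampling_strategy="oversampling",
    warmup_proportion=0.1,
    l2_weight=0.01,
    end_to_end=False,
    use_amp=False,
    seed=42,
)

# `train_dataset` is hypothetical -- supply your own labeled examples.
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```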
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0020 | 1 | 0.2432 | - |
| 0.1004 | 50 | 0.256 | - |
| 0.2008 | 100 | 0.2208 | - |
| 0.3012 | 150 | 0.0894 | - |
| 0.4016 | 200 | 0.0315 | - |
| 0.5020 | 250 | 0.0065 | - |
| 0.6024 | 300 | 0.0025 | - |
| 0.7028 | 350 | 0.0022 | - |
| 0.8032 | 400 | 0.002 | - |
| 0.9036 | 450 | 0.002 | - |
### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1
## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```