# SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
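The contrastive step works on pairs of training sentences: two sentences sharing a label are treated as a positive pair (their embeddings are pulled together), while two sentences with different labels form a negative pair (their embeddings are pushed apart). A minimal sketch of that pair generation, for illustration only; the actual sampling is handled internally by the setfit trainer:

```python
from itertools import combinations

def generate_pairs(sentences, labels):
    """Build (sentence_a, sentence_b, target) triples for contrastive fine-tuning.

    Pairs with matching labels get target 1.0 (pull embeddings together);
    pairs with differing labels get target 0.0 (push embeddings apart).
    """
    pairs = []
    for (s1, l1), (s2, l2) in combinations(zip(sentences, labels), 2):
        pairs.append((s1, s2, 1.0 if l1 == l2 else 0.0))
    return pairs

# Toy inputs: two label-1 sentences and one label-0 sentence.
pairs = generate_pairs(["good answer", "solid answer", "bad answer"], [1, 1, 0])
# Yields one positive pair and two negative pairs.
```

With `num_iterations: 20` (see the hyperparameters below), setfit samples pairs like these repeatedly rather than enumerating every combination, which is what makes few-shot training efficient.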
## Model Details

### Model Description

### Model Sources

### Model Labels
| Label | Examples |
|:------|:---------|
| 1 | <ul><li>"Reasoning: \n\n1. Context Grounding: The provided document details the events leading to Joan Gaspart’s resignation and confirms it happened after a poor season in 2003.\n2. Relevance: The answer directly addresses the specific question about who resigned from the presidency after Barcelona's poor showing in the 2003 season.\n3. Conciseness: The answer is clear and to the point without unnecessary information.\n\nFinal Result:"</li><li>"Reasoning:\n1. Context Grounding: The answer appropriately pulls references from the document, mentioning the hazards of working with electricity and the potential for long-term issues if electrical work isn't done correctly, which aligns with the provided content.\n2. Relevance: The answer directly addresses why it is beneficial to hire a professional electrician like O’Hara Electric, explicitly tying into the concerns of safety, expertise, and ensuring the job is done correctly on the first attempt.\n3. Conciseness: The answer is concise and to the point, avoiding unnecessary information and sticking closely to the reasons why hiring a professional is recommended.\n\nFinal Result:"</li><li>'Reasoning:\n1. Context Grounding: The provided document explicitly states that Aerosmith's 1987 comeback album was "Permanent Vacation". The answer is directly supported by this information.\n2. Relevance: The answer is directly related to and completely addresses the question about the title of Aerosmith's 1987 comeback album.\n3. Conciseness: The answer is concise and to the point, providing only the necessary information without any extraneous details.\n\nFinal result:'</li></ul> |
| 0 | <ul><li>'Reasoning:\n1. Context Grounding: The response effectively uses the provided document to form its recommendations. It pulls together various tips on identifying and avoiding triggers, utilizing sensory substitutes, and participating in alternative activities to manage cravings. \n2. Relevance: The answer remains focused on the question, directly addressing how to stop cravings for smoking by providing actionable advice and methods.\n3. Conciseness: While informative, the answer could benefit from being slightly more concise. Some points, such as the detailed explanation about licorice root, might be trimmed for brevity.\n\nFinal Result:'</li><li>'Reasoning:\n1. Context Grounding: The provided answer is rooted in the document, which mentions that Amy Bloom finds starting a project hard and having to clear mental space, recalibrate, and become less involved in her everyday life.\n2. Relevance: The response accurately focuses on the challenges Bloom faces when starting a significant writing project, without deviating into irrelevant areas.\n3. Conciseness: The answer effectively summarizes the relevant information from the document, staying clear and to the point while avoiding unnecessary detail.\n\nFinal Result:'</li><li>'Reasoning:\n1. Context Grounding: The provided document does have a listing for a 6 bedroom detached house. The factual details in the answer given by the user are not present in the document.\n2. Relevance: The user’s answer lists a different price (£2,850,000 vs. £950,000) and an incorrect address (Highgate Lane, Leeds, Berkshire, RG12 vs Willow Drive, Twyford, Reading, Berkshire, RG10) than the provided document.\n3. Conciseness: The user’s answer, though concise, is factually incorrect based on the document.\n\nFinal Result:'</li></ul> |
## Evaluation

### Metrics
| Label | Accuracy |
|:------|:---------|
| all | 0.8108 |
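Accuracy here is the share of evaluation examples whose predicted label matches the reference label, so 0.8108 means roughly 81% of the held-out examples were classified correctly. A minimal sketch of the metric (the example labels below are hypothetical, not taken from this card's evaluation set):

```python
def accuracy(predictions, references):
    """Return the fraction of positions where prediction equals reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must be the same length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Hypothetical predictions vs. references: 3 of 4 match.
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 0.75
```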
## Uses

### Direct Use for Inference
First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the Hugging Face Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wikisum_gpt-4o_cot-instructions_remove_final_evaluation_e1_larger_train_1")
# Run inference
preds = model("Reasoning: The given answer is well-supported by the provided document and includes details that are directly relevant to the identification of a funnel spider, such as the dark brown or black body and the presence of a hard, shiny carapace. It also mentions the two large fangs which is a key characteristic. The answer stays focused on the specific question of identifying a funnel spider, without deviating into unrelated topics.\nFinal Result:")
```
## Training Details

### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:--------|:----|
| Word count | 33 | 88.6482 | 198 |
| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 95 |
| 1 | 104 |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
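These names correspond to fields of setfit's `TrainingArguments`. A hedged sketch of how a comparable run could be configured (the dataset is a placeholder and is not part of this card; this is an illustration of the argument names above, not the exact training script):

```python
from setfit import SetFitModel, Trainer, TrainingArguments

# Start from the same base embedding model as this card.
model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")

args = TrainingArguments(
    batch_size=(16, 16),              # (embedding phase, classifier phase)
    num_epochs=(1, 1),                # one epoch for each phase
    num_iterations=20,                # contrastive pairs sampled per example
    body_learning_rate=(2e-5, 2e-5),  # Sentence Transformer body
    head_learning_rate=2e-5,          # classification head
    sampling_strategy="oversampling",
    warmup_proportion=0.1,
    l2_weight=0.01,
    end_to_end=False,
    use_amp=False,
    seed=42,
)

# `train_dataset` is hypothetical -- supply your own labeled examples.
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```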
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0020 | 1 | 0.2432 | - |
| 0.1004 | 50 | 0.256 | - |
| 0.2008 | 100 | 0.2208 | - |
| 0.3012 | 150 | 0.0894 | - |
| 0.4016 | 200 | 0.0315 | - |
| 0.5020 | 250 | 0.0065 | - |
| 0.6024 | 300 | 0.0025 | - |
| 0.7028 | 350 | 0.0022 | - |
| 0.8032 | 400 | 0.002 | - |
| 0.9036 | 450 | 0.002 | - |
### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1
## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```