# SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
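The two stages can be sketched in miniature. Below, randomly clustered vectors stand in for embeddings from the contrastively fine-tuned Sentence Transformer, and a hand-rolled logistic head stands in for scikit-learn's LogisticRegression; all names and numbers are illustrative, not taken from this model's training:

```python
import numpy as np

# Stage 1 (stand-in): pretend these are sentence embeddings from a
# contrastively fine-tuned encoder, so same-class points cluster together.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.3, size=(8, 4)),   # class 0 cluster
               rng.normal(+1.0, 0.3, size=(8, 4))])  # class 1 cluster
y = np.array([0] * 8 + [1] * 8)

# Stage 2: fit a logistic-regression head on the frozen embeddings
# via plain gradient descent on the logistic loss.
w, b = np.zeros(4), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid of the linear score
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * float(np.mean(p - y))

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print(float((preds == y).mean()))  # a well-separated toy set fits (near-)perfectly
```

The point of the contrastive stage is exactly this geometry: once same-class embeddings sit close together, even a simple linear head separates them.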
## Model Details

### Model Description

### Model Sources

### Model Labels
| Label | Examples |
|:------|:---------|
| 1 | <ul><li>"The answer provided is comprehensive and directly addresses the question. Here is the reasoning:\n\n1. Context Grounding: The answer precisely matches the details provided in the document. Patricia Wallace's various roles, including managing a clothing closet, overseeing a food pantry, coordinating the food backpack program, and leading the Intervention Support Team, are well-supported by the text.\n \n2. Relevance: The answer is entirely relevant to the question, as it lists the specific roles and responsibilities of Patricia Wallace at Oak View Elementary as outlined in the document.\n\n3. Conciseness: The answer is clear and focused, listing the relevant roles and responsibilities without unnecessary information.\n\nTherefore, the evaluation is:"</li><li>"Reasoning:\n1. Context Grounding: The answer is well supported by the document provided. It details the necessary steps to administer a saline solution to a baby, which is a method found within the source text.\n2. Relevance: The answer focuses specifically on treating a baby's cough, directly addressing the question asked.\n3. Conciseness: The answer is clear and to the point, providing concrete steps without unnecessary information. However, the answer could have been made even more concise by avoiding repetitions about the saline solution preparation.\n\nFinal evaluation:"</li><li>'Reasoning:\n1. Context Grounding: The answer provided accurately reflects the information in the document, describing the symptoms, risk factors, and necessary actions if toxic shock syndrome (TSS) is suspected.\n2. Relevance: The answer directly addresses the question asked, focusing on how to recognize TSS and what to do if you suspect you have it.\n3. Conciseness: The answer effectively condenses the necessary information into a coherent, straightforward explanation without extraneous details.\n\nFinal Evaluation:'</li></ul> |
| 0 | <ul><li>'Evaluation:\nThe answer provided incorrectly identifies the creation of a "literature hall" instead of a "science hall" as mentioned in the document. The answer also correctly attributes the oversight to Fr. Zahm, but this information is related to the wrong type of hall as per the document.\n\n1. Context Grounding: The document specifically states that a "Science Hall" was built under the direction of Fr. Zahm in 1883, not a literature hall.\n2. Relevance: The answer partially addresses the question correctly by mentioning Fr. Zahm, but it misidentifies the type of hall constructed.\n3. Conciseness: The answer is concise but includes incorrect information.\n\nThe final evaluation:'</li><li>'Reasoning:\n1. Context Grounding: The document supports that Gregory Johnson is the CEO of Franklin Templeton Investments and provides sufficient context about his role and relation to the company.\n2. Relevance: The answer directly addresses the question about the CEO of Franklin Templeton Investments.\n3. Conciseness: The answer presents the information clearly and succinctly without unnecessary details.\n\nFinal Result: ****'</li><li>'The answer correctly identifies that retired priests and brothers live at Fatima House. However, the additional information about the rare collection of ancient religious manuscripts at Fatima House is not supported by the document, making it an irrelevant addition. This deviates from the principle of conciseness and relevance to the specific question asked.\n\nFinal evaluation:'</li></ul> |
## Evaluation

### Metrics
| Label | Accuracy |
|:------|:---------|
| all   | 0.8649   |
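Accuracy here is the fraction of evaluation examples whose predicted label exactly matches the gold label. A minimal sketch of that computation (the toy predictions below are illustrative, not this model's evaluation set):

```python
def accuracy(preds, golds):
    # Fraction of exact label matches between predictions and gold labels.
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```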
## Uses

### Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel

# Download the model from the Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wikisum_gpt-4o_cot-few_shot-instructions_remove_final_evaluation_e1_large")
# Run inference
preds = model("""Reasoning:
1. Context Grounding: The provided answer reflects the key identifiers of a funnel spider found in the document, such as the dark brown or black body, hard shiny carapace, and large fangs.
2. Relevance: The answer directly addresses the question of how to identify a funnel spider with relevant details.
3. Conciseness: The answer is clear, with pertinent details, and avoids unnecessary information.
Final evaluation:""")
```
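The call above returns the predicted class label (0 or 1, per the label table earlier in this card). A hedged sketch of post-processing a batch of such predictions into readable verdicts; the `VERDICTS` mapping is illustrative, not part of the model:

```python
# Hypothetical post-processing: map the model's 0/1 outputs to readable verdicts.
VERDICTS = {0: "answer rejected", 1: "answer accepted"}

def to_verdicts(preds):
    # Cast each prediction to int so tensors or floats are handled uniformly.
    return [VERDICTS[int(p)] for p in preds]

print(to_verdicts([1, 0, 1]))  # ['answer accepted', 'answer rejected', 'answer accepted']
```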
## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 11  | 76.2020 | 196 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 94                    |
| 1     | 104                   |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
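The `loss: CosineSimilarityLoss` entry refers to sentence-transformers' pairwise objective, which by default minimizes the squared error between a pair's cosine similarity and its gold label (1.0 for same-class pairs, 0.0 for different-class pairs). A minimal worked example of that quantity in plain Python (a sketch of the formula, not the library's implementation):

```python
import math

def cosine_similarity(u, v):
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cosine_similarity_loss(u, v, label):
    # Squared error between the pair's cosine similarity and its gold label.
    return (cosine_similarity(u, v) - label) ** 2

same = cosine_similarity_loss([1.0, 0.0], [1.0, 0.0], 1.0)  # identical pair, label 1 -> 0.0
diff = cosine_similarity_loss([1.0, 0.0], [0.0, 1.0], 0.0)  # orthogonal pair, label 0 -> 0.0
```

Minimizing this loss pulls same-class embeddings toward cosine similarity 1 and pushes different-class embeddings toward 0, which is what makes the downstream logistic head easy to fit.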
### Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0020 | 1    | 0.2119        | -               |
| 0.1010 | 50   | 0.255         | -               |
| 0.2020 | 100  | 0.1703        | -               |
| 0.3030 | 150  | 0.0611        | -               |
| 0.4040 | 200  | 0.0351        | -               |
| 0.5051 | 250  | 0.0197        | -               |
| 0.6061 | 300  | 0.0172        | -               |
| 0.7071 | 350  | 0.0109        | -               |
| 0.8081 | 400  | 0.0108        | -               |
| 0.9091 | 450  | 0.0072        | -               |
### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1
## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```