# SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
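The contrastive step above can be illustrated with a minimal, dependency-free sketch of the pair-generation idea: same-label texts become positive pairs (target 1.0) and different-label texts become negative pairs (target 0.0). The helper name `generate_pairs` is hypothetical; SetFit's real sampler is more involved.

```python
import random

def generate_pairs(examples, labels, num_iterations=20, seed=42):
    """Sample positive/negative sentence pairs for contrastive fine-tuning.

    Same-label pairs get target 1.0, different-label pairs get 0.0.
    This mirrors the idea behind SetFit's pair sampling, not its exact code.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for text, label in zip(examples, labels):
            same = [t for t, l in zip(examples, labels) if l == label and t != text]
            diff = [t for t, l in zip(examples, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))  # positive pair
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))  # negative pair
    return pairs

# 4 examples x 2 iterations x (1 positive + 1 negative) = 16 pairs
pairs = generate_pairs(["good", "great", "bad", "awful"], [1, 1, 0, 0], num_iterations=2)
```

The resulting (text, text, target) triples are what a loss such as CosineSimilarityLoss consumes when fine-tuning the embedding model.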
## Model Details

### Model Description

### Model Sources

### Model Labels
| Label | Examples |
|:------|:---------|
| 0 | <ul><li>"Reasoning:\n1. Context Grounding: The answer aligns well with the provided document, specifically discussing coach Brian Shaw's influence and changes in the team strategy, which are mentioned in the text.\n2. Relevance: The response directly addresses the question by focusing on the reasons behind the Nuggets' offensive success in January, such as the new gameplay strategy advocated by the coach and increased comfort and effectiveness.\n3. Conciseness: The answer is mostly concise but adds an unsubstantiated point about virtual reality training, which is not mentioned in the document and should be excluded to maintain briefing relevance.\n\nFinal result: ****."</li><li>"Reasoning:\n1. Context Grounding: The answer effectively uses specific details from the provided document, discussing the author's experience with digital and film photography, and technical differences such as how each medium handles exposure and color capture.\n2. Relevance: The answer is directly relevant to the question, enumerating specific differences mentioned by the author.\n3. Conciseness: While mostly concise, the answer could have been slightly more succinct. However, it largely avoids unnecessary information and remains clear and to the point.\n\nFinal Result:"</li><li>"Reasoning:\n\n1. Context Grounding: The answer given details the results of a mixed martial arts event, specifically highlighting Antonio Rogerio Nogueira's victory. However, the question asks about the main conflict in the third book of the Arcana Chronicles by Kresley Cole. There is no relevance in the provided document or the answer to the Arcana Chronicles.\n2. Relevance: The answer does not address the asked question at all. Instead, it provides information about an MMA fight, which is entirely unrelated to the Arcana Chronicles.\n3. Conciseness: While the answer is concise, it fails to answer the appropriate question, thus making its conciseness irrelevant in this context.\n\nFinal Result:"</li></ul> |
| 1 | <ul><li>'Reasoning:\n\n1. Context Grounding: The answer provided is well-supported by the document and grounded in the text, which discusses best practices for web designers to avoid unnecessary revisions and conflicts. It specifically addresses parts of the document that highlight getting to know the client, signing a contract, and being honest and diplomatic.\n \n2. Relevance: The answer directly addresses the question of best practices a web designer can incorporate into their client discovery and web design process. It does not deviate into unrelated topics and remains relevant throughout.\n\n3. Conciseness: The answer is clear and concise. It covers the main points without unnecessary elaboration or inclusion of extraneous information.\n\nFinal Result:'</li><li>"Reasoning:\n\n1. Context Grounding: The answer provided is well-supported by the document. The document discusses the importance of drawing from one's own experiences, particularly those involving pain and emotion, in order to create genuine and relatable characters.\n2. Relevance: The answer directly addresses the question of what the author believes is the key to creating a connection between the reader and the characters.\n3. Conciseness: The answer is clear and to the point, avoiding unnecessary information.\n\nFinal Result:"</li><li>'Reasoning:\n1. Context Grounding: The answer directly refers to the document, which mentions Mauro Rubin as the CEO of JoinPad during the event.\n2. Relevance: The answer specifically addresses the question asked about the CEO of JoinPad during the event.\n3. Conciseness: The answer is clear, direct, and does not include unnecessary information.\n\nFinal result:'</li></ul> |
## Uses

### Direct Use for Inference
First install the SetFit library:

```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_gpt-4o_cot-instructions_remove_final_evaluation_e1_one_big_model_17270799")
# Run inference
preds = model("""Reasoning:
1. Context Grounding: The answer provided is directly supported by the document, which states, "Allan Cox's First Class Delivery on a H128-10W for his Level 1 certification flight."
2. Relevance: The answer directly addresses the specific question asked, detailing the rocket and the motor used for Allan Cox's Level 1 certification flight.
3. Conciseness: The answer is clear and to the point, without any extraneous information.
Final result:""")
```
## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 32  | 88.2983 | 198 |
| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 200                   |
| 1     | 209                   |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
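A quick sanity check ties these hyperparameters to the step counts in the training results: assuming SetFit's standard scheme of 2 × num_iterations contrastive pairs per training sample (an assumption, not stated in this card), one epoch over the 409 training samples at batch size 16 works out to about 1023 steps, which matches the final logged step of 1000 at epoch ~0.9775.

```python
import math

# Back-of-the-envelope step count, assuming 2 * num_iterations
# contrastive pairs are generated per training sample.
num_samples = 200 + 209   # label counts from the table above
num_iterations = 20
batch_size = 16

total_pairs = 2 * num_iterations * num_samples        # 16360 pairs
steps_per_epoch = math.ceil(total_pairs / batch_size)  # 1023 steps
```

1000 / 1023 ≈ 0.9775, consistent with the last row of the results table.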
### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0010 | 1    | 0.161         | -               |
| 0.0489 | 50   | 0.2637        | -               |
| 0.0978 | 100  | 0.2513        | -               |
| 0.1466 | 150  | 0.151         | -               |
| 0.1955 | 200  | 0.1002        | -               |
| 0.2444 | 250  | 0.0596        | -               |
| 0.2933 | 300  | 0.0383        | -               |
| 0.3421 | 350  | 0.0236        | -               |
| 0.3910 | 400  | 0.0121        | -               |
| 0.4399 | 450  | 0.0075        | -               |
| 0.4888 | 500  | 0.0046        | -               |
| 0.5376 | 550  | 0.0031        | -               |
| 0.5865 | 600  | 0.0029        | -               |
| 0.6354 | 650  | 0.0031        | -               |
| 0.6843 | 700  | 0.0017        | -               |
| 0.7331 | 750  | 0.0016        | -               |
| 0.7820 | 800  | 0.0014        | -               |
| 0.8309 | 850  | 0.0013        | -               |
| 0.8798 | 900  | 0.0014        | -               |
| 0.9286 | 950  | 0.0015        | -               |
| 0.9775 | 1000 | 0.0014        | -               |
### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1
## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```