SetFit

This is a SetFit model for text classification. It uses a LogisticRegression instance as its classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
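
The first step can be illustrated with a minimal sketch of how labeled examples become contrastive training pairs. This is not SetFit's own sampling code: it enumerates every pair for clarity, whereas SetFit samples pairs. The function name `make_contrastive_pairs` and the toy texts are hypothetical. Pairs with the same label get a target similarity of 1.0 and pairs with different labels get 0.0, which is the kind of target a cosine-similarity loss trains the embeddings toward.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Hypothetical toy data, not taken from the training set.
texts = [
    "coup in the capital",
    "junta closes the borders",
    "new game codes giveaway",
    "football match highlights",
]
labels = ["relevant", "relevant", "irrelevant", "irrelevant"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

With few labeled examples, pairing multiplies the amount of training signal: four texts already give six pairs here.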

Model Details

Model Description

  • Model Type: SetFit
  • Classification head: a LogisticRegression instance
  • Maximum Sequence Length: 128 tokens
  • Number of Classes: 2

Model Sources

Model Labels

Label Examples
irrelevant
  • 'RT : ส้มมาถูกทางแล้วลูก ใช้ใจแลกใจไปเลย ยิ่งในตจว. ยิ่งต้องทำให้คนในพื้นที่เห็นหน้าเห็นตาเราบ่อย ๆ ชาวบ้านเปนงี้กันจริงมึง ไ…'
  • 'Esa información puede venir de Koeman y su entorno, de Messi y su entorno o del club vía una conversación con los dos anteriores. Conociendo a los periodistas, pinta más a un Koeman habla con un amigo rollo Bakero, de ahí a la junta y presionamos a Messi dejándolo de mali'
  • 'RT : 🎮 2 Perfect Match controllers\u200b 🧢 2 Perfect Match hats 🏈 Game codes for Madden 26, EAFC 26 and NBA 2K26\u200b Like & comment with Per…'
relevant
  • "Coup d'Etat au Mali: La Cédéao condamne et ferme les frontières du pays"
  • "Coup d'État au Mali : Umaro Sissoco Embaló fait son show à la Cedeao"
  • "Ibk a déjà rendu sa démission, que la Cedeo se contente d'accompagner la population et la junte afin que le pays avance"

Evaluation

Metrics

Label Accuracy
all 0.9583

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("beethogedeon/coup-detat-tweets-relevancy-classifier-MiniLM-L12-v2")
# Run inference
preds = model("Interesting")
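
For more than one input, a small helper along these lines may be convenient. It is a sketch that assumes `model` exposes a `predict` method accepting a list of strings, as `SetFitModel` does, and that predictions come back as string labels such as the `relevant` and `irrelevant` labels in this card. The helper name `label_texts` is hypothetical.

```python
def label_texts(model, texts):
    """Return (text, predicted_label) pairs for an iterable of input strings."""
    texts = list(texts)  # materialize once so generators are also accepted
    predictions = model.predict(texts)
    # str() normalizes array or NumPy string scalars to plain Python strings.
    return [(text, str(pred)) for text, pred in zip(texts, predictions)]


# Usage with the model loaded above (requires the setfit package and network access):
# results = label_texts(model, ["Coup d'Etat au Mali", "Interesting"])
# for text, label in results:
#     print(label, "|", text)
```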

Training Details

Training Set Metrics

Training set    Min    Median     Max
Word count      2      20.2031    51

Label         Training Sample Count
irrelevant    32
relevant      32
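
As a rough illustration of how such word-count statistics can be computed, the sketch below splits each text on whitespace. That counting rule is an assumption, and the sample texts are hypothetical. One observation: the reported middle value, 20.2031, matches 1293/64 rounded to four decimals, so it may be a mean over the 64 training samples rather than a true median, since the median of 64 integer counts would be a whole or half number. The sketch reports both.

```python
from statistics import mean, median


def word_count_stats(texts):
    """Return min, mean, median and max of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "mean": mean(counts),
        "median": median(counts),
        "max": max(counts),
    }


# Hypothetical samples, not the actual training set.
samples = [
    "Coup d'Etat au Mali",
    "Interesting",
    "La Cédéao ferme les frontières du pays",
    "Mali",
]
stats = word_count_stats(samples)
```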

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0125 1 0.3861 -
0.625 50 0.1327 -
1.25 100 0.0046 -
1.875 150 0.0016 -
2.5 200 0.0011 -
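
The Epoch and Step columns can be related with a short calculation. The sketch assumes each training sample yields `num_iterations` positive and `num_iterations` negative pairs, which is an assumption about the pair sampling and not something this card states. Under that assumption, the numbers agree with the logged epoch fractions.

```python
# Values taken from this card.
n_samples = 32 + 32        # 32 irrelevant + 32 relevant training samples
num_iterations = 20
batch_size = 32            # embedding-phase batch size
num_epochs = 3             # embedding-phase epochs
warmup_proportion = 0.1

# Assumption: 2 * num_iterations pairs are generated per training sample.
n_pairs = 2 * num_iterations * n_samples              # 2560
steps_per_epoch = n_pairs // batch_size               # 80
total_steps = steps_per_epoch * num_epochs            # 240
warmup_steps = round(total_steps * warmup_proportion)  # about 24


def epoch_at(step):
    """Fractional epoch reached after a given optimizer step."""
    return step / steps_per_epoch


# (step, epoch) rows from the Training Results table.
logged = {1: 0.0125, 50: 0.625, 100: 1.25, 150: 1.875, 200: 2.5}
consistent = all(abs(epoch_at(s) - e) < 1e-9 for s, e in logged.items())
```

If the assumption holds, training runs for 240 steps, and step 200 is simply the last one that falls on the 50-step logging interval.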

Framework Versions

  • Python: 3.12.4
  • SetFit: 1.1.3
  • Sentence Transformers: 5.3.0
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu129
  • Datasets: 4.7.0
  • Tokenizers: 0.22.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 33.4M params (Safetensors, tensor type BF16)