MorcuendeA committed on
Commit 90c0077 · verified · 1 Parent(s): 4602971

MulderFinders

Files changed (5)
  1. README.md +17 -72
  2. config.json +2 -2
  3. model.safetensors +1 -1
  4. tokenizer.json +2 -2
  5. training_args.bin +1 -1
README.md CHANGED
@@ -9,83 +9,30 @@ metrics:
 model-index:
 - name: MulderFinders
   results: []
-datasets:
-- MorcuendeA/ConspiraText-ES
-language:
-- es
 ---

-![MulderFinders Logo](./i_want_to_belive.png)
-
-
-# MulderFinders
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->

 # MulderFinders

-The truth is out there... and this model is here to help you find it.
-
-**MulderFinders** is a fine-tuned version of [EuroBERT/EuroBERT-210m](https://huggingface.co/EuroBERT/EuroBERT-210m), trained on [MorcuendeA/ConspiraText-ES](https://huggingface.co/datasets/MorcuendeA/ConspiraText-ES), a dataset of Spanish-language conspiratorial and non-conspiratorial text. Whether it's aliens, 5G towers, or secret societies, this model is ready to classify them all.
-
-Trust no one... except maybe the F1 score.
-
-## Usage
-
-You can use the model directly with the 🤗 Transformers library:
-
-```python
-from transformers import AutoTokenizer, AutoModelForSequenceClassification
-import torch
-
-model_name = "MorcuendeA/MulderFinders"
-
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForSequenceClassification.from_pretrained(model_name, trust_remote_code=True)
-
-text = "las redes 5G nos ayudan a tener mejor internet"
-
-inputs = tokenizer(text, return_tensors="pt")
-outputs = model(**inputs)
-logits = outputs.logits
-probs = torch.softmax(logits, dim=1)[0]
-labels = model.config.id2label
-pred = torch.argmax(probs).item()
-print(f"Prediction: {labels[pred]} ({probs[pred].item():.4f})")
-
-# Output:
-# Prediction: rational (0.9989)
-```
-
+This model is a fine-tuned version of [EuroBERT/EuroBERT-210m](https://huggingface.co/EuroBERT/EuroBERT-210m) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0004
-- Accuracy: 1.0
-- F1 Score: 1.0
+- Loss: 0.0059
+- Accuracy: 0.9981
+- F1 Score: 0.9983

 ## Model description

-**MulderFinders** is a Spanish-language text classification model fine-tuned to detect conspiracy-related content. It is based on [EuroBERT/EuroBERT-210m](https://huggingface.co/EuroBERT/EuroBERT-210m), a transformer model pre-trained on multiple European languages. MulderFinders performs binary classification, identifying whether a given piece of text expresses conspiratorial ideas or not.
+More information needed

 ## Intended uses & limitations

-**Intended uses:**
-
-- Content moderation on social media or online forums.
-- Research and analysis of conspiratorial discourse in Spanish-language texts.
-- Assisting fact-checking workflows by flagging potentially conspiratorial statements.
-
-**Limitations:**
-
-- May not handle sarcasm, irony, or ambiguous language reliably.
-- Performance outside the original domain (i.e., texts similar to the training dataset) may degrade.
-- May reflect biases present in the training data.
+More information needed

 ## Training and evaluation data

-The model was fine-tuned using the [ConspiraText-ES](https://huggingface.co/datasets/MorcuendeA/ConspiraText-ES) dataset, which contains Spanish-language examples labeled as conspiratorial or not. The dataset includes only synthetic text samples, covering various conspiracy-related themes.
-During fine-tuning, regularization was applied with **attention_dropout** and **hidden_dropout** both set to 0.3.
+More information needed

 ## Training procedure
@@ -106,18 +53,16 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Score |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:--------:|
-| 0.1365 | 0.3030 | 20 | 0.0282 | 0.9924 | 0.9927 |
-| 0.0633 | 0.6061 | 40 | 0.1290 | 0.9773 | 0.9774 |
-| 0.0362 | 0.9091 | 60 | 0.0390 | 0.9962 | 0.9963 |
-| 0.0271 | 1.2121 | 80 | 0.0284 | 0.9962 | 0.9963 |
-| 0.0001 | 1.5152 | 100 | 0.0079 | 0.9962 | 0.9963 |
-| 0.0026 | 1.8182 | 120 | 0.0322 | 0.9962 | 0.9963 |
+| 0.2601 | 0.3030 | 20 | 0.0532 | 0.9848 | 0.9855 |
+| 0.0771 | 0.6061 | 40 | 0.0197 | 0.9981 | 0.9982 |
+| 0.0271 | 0.9091 | 60 | 0.0218 | 0.9981 | 0.9982 |
+| 0.0189 | 1.2121 | 80 | 0.0182 | 0.9943 | 0.9945 |
+| 0.0176 | 1.5152 | 100 | 0.0093 | 0.9962 | 0.9963 |

 ### Framework versions

-- Transformers 4.53.2
+- Transformers 4.54.0
 - Pytorch 2.6.0+cu124
-- Datasets 2.14.4
+- Datasets 4.0.0
 - Tokenizers 0.21.2
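The Usage snippet removed from the README above reduces to softmax-then-argmax over the classifier's logits. A framework-free sketch of that decision logic (plain Python rather than torch; label 0 mirrors the `"0": "rational"` entry visible in the config.json diff, while the name of label 1 is a placeholder, since the diff does not show it):

```python
import math

def softmax(logits):
    # Numerically stable softmax: shift by the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# "rational" comes from the config.json diff; "conspiracy" is a placeholder name.
id2label = {0: "rational", 1: "conspiracy"}

def predict(logits):
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    return id2label[pred], probs[pred]

# Stand-in logits, not real model output:
label, prob = predict([3.2, -1.1])
print(f"Prediction: {label} ({prob:.4f})")
```

The torch version in the removed README does the same thing with `torch.softmax` and `torch.argmax`; only the tensor plumbing differs.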
config.json CHANGED
@@ -3,7 +3,7 @@
   "EuroBertForSequenceClassification"
 ],
 "attention_bias": false,
-"attention_dropout": 0.3,
+"attention_dropout": 0.2,
 "auto_map": {
   "AutoConfig": "configuration_eurobert.EuroBertConfig",
   "AutoModel": "modeling_eurobert.EuroBertModel",
@@ -19,7 +19,7 @@
 "eos_token_id": 128001,
 "head_dim": 64,
 "hidden_act": "silu",
-"hidden_dropout": 0.3,
+"hidden_dropout": 0.2,
 "hidden_size": 768,
 "id2label": {
   "0": "rational",
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9c6d569960e50f952ac73dc824edee37878799713de4fee344cbc575b741918e
+oid sha256:57ebd62773f7092cfe0f5bb70bd3d2e849fab7643ab713b10a74dd2799e37f1b
 size 849445112
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bf5dc94ee8165749c233582f839e98776e7ad895f506dcea7556d68ba375ab73
-size 17210345
+oid sha256:98d4a1d32152d6cedf85b5e88f3b205106dca1fe72aaab34e0ac13c238421069
+size 17210235
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d209ad7a782f8bd52d93c64d8cfe3272215ced7a889639a474cfc3b0b88c0325
+oid sha256:e9bf8330025037b42d854a928006d6f6f6e6f07b712e41b90aaf441e4ca29cb5
 size 5304
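model.safetensors, tokenizer.json, and training_args.bin are stored via Git LFS, so their diffs touch only the pointer file: three `key value` lines in which the `oid` (and sometimes `size`) changes between commits. A minimal parser sketch, fed the new model.safetensors pointer from the diff above:

```python
def parse_lfs_pointer(text):
    # Each pointer line is "key value"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"],
        "size": int(fields["size"]),
    }

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:57ebd62773f7092cfe0f5bb70bd3d2e849fab7643ab713b10a74dd2799e37f1b
size 849445112
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # 849445112
```

An unchanged `size` with a new `oid` (as in model.safetensors and training_args.bin here) means the file content changed byte-for-byte in place, which is what retraining produces.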