End of training

Files changed (4)

README.md +26 -24
config.json +41 -15
model.safetensors +2 -2
training_args.bin +2 -2

README.md CHANGED

@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: distilbert-base-uncased
 tags:
 - generated_from_trainer
 metrics:
@@ -19,13 +19,15 @@ should probably proofread and complete it, then remove this comment. -->
 # results
-This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2282
-- Accuracy: 0.906
-- F1: 0.9052
-- Precision: 0.9048
-- Recall: 0.906
 ## Model description
@@ -48,30 +50,30 @@ The following hyperparameters were used during training:
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
-| No log        | 1.0   | 79   | 0.2501          | 0.904    | 0.9035 | 0.9032    | 0.904  |
-| 0.3965        | 2.0   | 158  | 0.2282          | 0.906    | 0.9052 | 0.9048    | 0.906  |
-| 0.1925        | 3.0   | 237  | 0.2596          | 0.9      | 0.9    | 0.9       | 0.9    |
-| 0.1348        | 4.0   | 316  | 0.3635          | 0.89     | 0.8901 | 0.8903    | 0.89   |
-| 0.1348        | 5.0   | 395  | 0.4710          | 0.88     | 0.8834 | 0.8937    | 0.88   |
-| 0.0627        | 6.0   | 474  | 0.4220          | 0.894    | 0.8928 | 0.8923    | 0.894  |
-| 0.038         | 7.0   | 553  | 0.4292          | 0.898    | 0.8969 | 0.8964    | 0.898  |
-| 0.0204        | 8.0   | 632  | 0.4625          | 0.894    | 0.8941 | 0.8943    | 0.894  |
-| 0.0148        | 9.0   | 711  | 0.4741          | 0.896    | 0.8950 | 0.8945    | 0.896  |
-| 0.0148        | 10.0  | 790  | 0.4743          | 0.896    | 0.8952 | 0.8948    | 0.896  |
 ### Framework versions
-- Transformers 4.44.2
-- Pytorch 2.5.0+cu121
-- Datasets 3.1.0
-- Tokenizers 0.19.1

 ---
 library_name: transformers
 license: apache-2.0
+base_model: bert-base-uncased
 tags:
 - generated_from_trainer
 metrics:
 # results
+This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.8481
+- Accuracy: 0.425
+- F1: 0.4068
+- Precision: 0.4371
+- Recall: 0.425
+- Mse: 5.314
+- Mae: 1.37
 ## Model description
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
+- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall | Mse    | Mae   |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|:------:|:-----:|
+| 1.9914        | 1.0   | 157  | 1.7086          | 0.404    | 0.2561 | 0.3800    | 0.404  | 10.332 | 1.95  |
+| 1.5651        | 2.0   | 314  | 1.6295          | 0.419    | 0.3343 | 0.4048    | 0.419  | 7.397  | 1.591 |
+| 1.3878        | 3.0   | 471  | 1.6456          | 0.421    | 0.3666 | 0.4605    | 0.421  | 6.147  | 1.473 |
+| 1.1967        | 4.0   | 628  | 1.7054          | 0.42     | 0.3790 | 0.3598    | 0.42   | 5.874  | 1.44  |
+| 1.1002        | 5.0   | 785  | 1.7713          | 0.414    | 0.3896 | 0.3701    | 0.414  | 5.647  | 1.419 |
+| 0.9412        | 6.0   | 942  | 1.8481          | 0.425    | 0.4068 | 0.4371    | 0.425  | 5.314  | 1.37  |
+| 0.8737        | 7.0   | 1099 | 1.9534          | 0.407    | 0.4007 | 0.4025    | 0.407  | 5.141  | 1.375 |
+| 0.757         | 8.0   | 1256 | 2.0153          | 0.401    | 0.3932 | 0.3918    | 0.401  | 5.227  | 1.385 |
+| 0.6973        | 9.0   | 1413 | 2.0556          | 0.404    | 0.3979 | 0.4004    | 0.404  | 5.176  | 1.376 |
+| 0.6573        | 10.0  | 1570 | 2.0672          | 0.408    | 0.4008 | 0.4003    | 0.408  | 5.179  | 1.373 |
 ### Framework versions
+- Transformers 4.46.3
+- Pytorch 2.5.1+cu121
+- Datasets 3.2.0
+- Tokenizers 0.20.3
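
The card does not say how the reported metrics were computed. In every row of both tables Recall equals Accuracy, which is what support-weighted averaging produces, and the new Mse and Mae columns presumably treat the ten class ids as numbers. A minimal sketch under those two assumptions (the function name `classification_report` and the toy labels are illustrative, not taken from the training script):

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Accuracy, support-weighted precision/recall/F1, and MSE/MAE on label ids."""
    n = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n         # weight each class by its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    # Assumption: class ids are ordinal, so their difference is a meaningful error.
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return {
        "accuracy": sum(correct.values()) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mse": sum(e * e for e in errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
    }


# Toy labels only; the real evaluation set is not part of this commit.
report = classification_report([0, 1, 2, 2, 9], [0, 2, 2, 2, 5])
print(report)
```

With weighted averaging, recall reduces to the fraction of correct predictions, so the Recall and Accuracy columns coincide by construction rather than by chance.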

config.json CHANGED

@@ -1,25 +1,51 @@
 {
-  "_name_or_path": "distilbert-base-uncased",
-  "activation": "gelu",
   "architectures": [
-    "DistilBertForSequenceClassification"
   ],
-  "attention_dropout": 0,
-  "dim": 768,
-  "dropout": 0.1,
-  "hidden_dim": 3072,
   "initializer_range": 0.02,
   "max_position_embeddings": 512,
-  "model_type": "distilbert",
-  "n_heads": 12,
-  "n_layers": 6,
   "pad_token_id": 0,
   "problem_type": "single_label_classification",
-  "qa_dropout": 0.1,
-  "seq_classif_dropout": 0.2,
-  "sinusoidal_pos_embds": false,
-  "tie_weights_": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.44.2",
   "vocab_size": 30522
 }

 {
+  "_name_or_path": "bert-base-uncased",
   "architectures": [
+    "BertForSequenceClassification"
   ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0",
+    "1": "LABEL_1",
+    "2": "LABEL_2",
+    "3": "LABEL_3",
+    "4": "LABEL_4",
+    "5": "LABEL_5",
+    "6": "LABEL_6",
+    "7": "LABEL_7",
+    "8": "LABEL_8",
+    "9": "LABEL_9"
+  },
   "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "LABEL_0": 0,
+    "LABEL_1": 1,
+    "LABEL_2": 2,
+    "LABEL_3": 3,
+    "LABEL_4": 4,
+    "LABEL_5": 5,
+    "LABEL_6": 6,
+    "LABEL_7": 7,
+    "LABEL_8": 8,
+    "LABEL_9": 9
+  },
+  "layer_norm_eps": 1e-12,
   "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
   "pad_token_id": 0,
+  "position_embedding_type": "absolute",
   "problem_type": "single_label_classification",
   "torch_dtype": "float32",
+  "transformers_version": "4.46.3",
+  "type_vocab_size": 2,
+  "use_cache": true,
   "vocab_size": 30522
 }
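
Most of this config diff is renaming: DistilBERT and BERT use different key names for the same hyperparameters. The sketch below lines the two sides up and reports only the values that differ. The dictionaries are copied from the diff; the key correspondence in `DISTILBERT_TO_BERT` is my reading of the field names, and `changed_hyperparameters` is a hypothetical helper.

```python
# Values copied from the removed (DistilBERT) side of the diff.
old_config = {
    "activation": "gelu",
    "attention_dropout": 0,
    "dim": 768,
    "dropout": 0.1,
    "hidden_dim": 3072,
    "n_heads": 12,
    "n_layers": 6,
}

# Values copied from the added (BERT) side of the diff.
new_config = {
    "hidden_act": "gelu",
    "attention_probs_dropout_prob": 0.1,
    "hidden_size": 768,
    "hidden_dropout_prob": 0.1,
    "intermediate_size": 3072,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
}

# Assumed correspondence between DistilBERT and BERT key names.
DISTILBERT_TO_BERT = {
    "activation": "hidden_act",
    "attention_dropout": "attention_probs_dropout_prob",
    "dim": "hidden_size",
    "dropout": "hidden_dropout_prob",
    "hidden_dim": "intermediate_size",
    "n_heads": "num_attention_heads",
    "n_layers": "num_hidden_layers",
}


def changed_hyperparameters(old, new, key_map):
    """Return {new_key: (old_value, new_value)} for mapped keys whose values differ."""
    changes = {}
    for old_key, new_key in key_map.items():
        if old[old_key] != new[new_key]:
            changes[new_key] = (old[old_key], new[new_key])
    return changes


changes = changed_hyperparameters(old_config, new_config, DISTILBERT_TO_BERT)
print(changes)
```

Among the mapped keys, only the layer count (6 to 12) and the attention dropout (0 to 0.1) change. The new config also adds a ten-entry `id2label`/`label2id` mapping, which the old file does not contain.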

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9118ca147ab9269bedf71ff821d0d2c37eaae6297df4ffec7d7b3fbd2f5b2ffc
-size 267832560

 version https://git-lfs.github.com/spec/v1
+oid sha256:bb22add8717267db6ae44f80a22279062a6ec4439e3897548f553597633c87fc
+size 437983256
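
The jump in file size is consistent with the switch from a 6-layer to a 12-layer encoder. As a rough cross-check, the sketch below counts the parameters of a BERT sequence classifier from the values in the new config.json and converts them to bytes. It assumes the standard BERT layout (embeddings, encoder layers, pooler, one linear classifier head) and float32 storage, as `torch_dtype` suggests; `bert_classifier_params` is a hypothetical helper, not a library function.

```python
def bert_classifier_params(
    vocab=30522,        # vocab_size
    positions=512,      # max_position_embeddings
    token_types=2,      # type_vocab_size
    hidden=768,         # hidden_size
    intermediate=3072,  # intermediate_size
    layers=12,          # num_hidden_layers
    num_labels=10,      # entries in id2label
):
    """Count weights and biases of a BERT encoder with a linear classification head."""
    layer_norm = 2 * hidden  # scale and bias
    embeddings = (vocab + positions + token_types) * hidden + layer_norm
    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden
    classifier = hidden * num_labels + num_labels
    return embeddings + layers * (attention + feed_forward) + pooler + classifier


params = bert_classifier_params()
weight_bytes = 4 * params   # 4 bytes per float32 parameter
file_size = 437983256       # size reported for the new model.safetensors
header_overhead = file_size - weight_bytes
print(params, weight_bytes, header_overhead)
```

Under these assumptions the count comes to 109,489,930 parameters, or 437,959,720 bytes of weights. That leaves about 23 KB unaccounted for, a plausible size for the safetensors header that lists tensor names, shapes, and offsets.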

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d50397049eb65ce8bae6bf364635057828e3d21e1d7cd3fbad437deb89cb76f0
-size 5112

 version https://git-lfs.github.com/spec/v1
+oid sha256:8c2530b4b26f976a6c549f19eb5ada7a0fd13969a722ffe1c41476f0f80de978
+size 5240