dv347 committed on
Commit
5f0ba68
·
verified ·
1 Parent(s): f7e74d2

Model save

Files changed (2)
  1. README.md +18 -18
  2. model.safetensors +1 -1
README.md CHANGED
@@ -16,11 +16,13 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8715
+- Loss: 4.0967
 - Exact Match: 0.0
-- Micro F1: 0.3416
-- Macro F1: 0.0445
-- Hamming Accuracy: 0.8703
+- Micro F1: 0.3075
+- Macro F1: 0.0334
+- Hamming Accuracy: 0.8806
+- Avg Pred Positives: 34.0
+- Avg Gold Positives: 13.5736
 
 ## Model description
 
@@ -39,29 +41,27 @@ More information needed
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 5e-06
+- learning_rate: 2e-05
 - train_batch_size: 32
 - eval_batch_size: 64
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 64
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 0.2
-- num_epochs: 5
+- num_epochs: 10
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Exact Match | Micro F1 | Macro F1 | Hamming Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:-----------:|:--------:|:--------:|:----------------:|
-| 0.4892        | 0.5   | 328  | 1.3449          | 0.0         | 0.3826   | 0.0421   | 0.8956           |
-| 0.5243        | 1.0   | 656  | 0.9100          | 0.0         | 0.3610   | 0.0367   | 0.8968           |
-| 0.5423        | 1.5   | 984  | 0.8660          | 0.0         | 0.3658   | 0.0484   | 0.8723           |
-| 0.5105        | 2.0   | 1312 | 0.8797          | 0.0         | 0.3385   | 0.0421   | 0.8740           |
-| 0.5304        | 2.5   | 1640 | 0.8877          | 0.0         | 0.3756   | 0.0464   | 0.8833           |
-| 0.5187        | 3.0   | 1968 | 0.9012          | 0.0         | 0.3827   | 0.0410   | 0.8981           |
-| 0.5122        | 3.5   | 2296 | 0.8840          | 0.0         | 0.3708   | 0.0458   | 0.8824           |
-| 0.5080        | 4.0   | 2624 | 0.8731          | 0.0         | 0.3919   | 0.0519   | 0.8776           |
-| 0.5403        | 4.5   | 2952 | 0.8727          | 0.0         | 0.3398   | 0.0443   | 0.8695           |
-| 0.5245        | 5.0   | 3280 | 0.8715          | 0.0         | 0.3416   | 0.0445   | 0.8703           |
+| Training Loss | Epoch | Step | Validation Loss | Exact Match | Micro F1 | Macro F1 | Hamming Accuracy | Avg Pred Positives | Avg Gold Positives |
+|:-------------:|:-----:|:----:|:---------------:|:-----------:|:--------:|:--------:|:----------------:|:------------------:|:------------------:|
+| 0.2930        | 0.5   | 164  | 0.5663          | 0.0         | 0.3139   | 0.0288   | 0.9041           | 25.0               | 13.5736            |
+| 0.1928        | 1.0   | 328  | 0.2720          | 0.0         | 0.4392   | 0.0256   | 0.9460           | 13.0               | 13.5736            |
+| 0.0751        | 1.5   | 492  | 0.0559          | 0.0         | 0.5244   | 0.0234   | 0.9628           | 8.0                | 13.5736            |
+| 33.3599       | 2.0   | 656  | 13.5265         | 0.0         | 0.2931   | 0.0226   | 0.9114           | 21.0               | 13.5736            |
+| 19.0859       | 2.5   | 820  | 10.5864         | 0.0         | 0.2214   | 0.0302   | 0.8376           | 44.0               | 13.5736            |
+| 8.6145        | 3.0   | 984  | 4.0967          | 0.0         | 0.3075   | 0.0334   | 0.8806           | 34.0               | 13.5736            |
 
 
 ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a94b77ae063bb226caaac9a70cba275eab900d8227cc136a6ddc27941f7d855f
+oid sha256:fd8e1f4b5b9413d593b5ba27ab1c4bb8c6921bda8faec79056c481955b303dc6
 size 870738024
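The README diff changes the schedule: the learning rate quadruples to 2e-05, gradient accumulation doubles the effective batch to 64, and training extends to 10 epochs under a cosine scheduler. As a minimal sketch of what that schedule looks like (assumptions: `lr_scheduler_warmup_steps: 0.2` acts as a warmup *ratio*, and the results table's 328 optimizer steps per epoch imply 3280 total steps; this is an illustration, not the Trainer's exact implementation):

```python
import math

# Values taken from the card's new hyperparameters; step counts are
# inferred from the results table (epoch 1.0 at step 328, num_epochs: 10).
PEAK_LR = 2e-05                 # learning_rate
TOTAL_STEPS = 328 * 10          # steps/epoch * num_epochs
WARMUP_STEPS = int(0.2 * TOTAL_STEPS)  # assuming 0.2 is a warmup ratio
# Effective batch: train_batch_size 32 * gradient_accumulation_steps 2 = 64.

def cosine_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate peaks at step 656 (epoch 2.0), which is also where the table shows the training loss spiking from 0.0751 to 33.3599, so the divergence coincides with the end of warmup at the full 2e-05 rate.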