Model Card for h10505jd-a63140nd-ED-Opt-B

This is a sequence relation classification model that was trained to detect whether a given piece of evidence is relevant to a given claim.

Model Details

Model Description

This model addresses the Evidence Detection (ED) shared task: given a claim and a piece of evidence, determine whether the evidence is relevant to that claim (binary classification). The model uses a BERT preprocessor and encoder, neither of which was fine-tuned, feeding into a multi-layer BLSTM with a self-attention mechanism that was trained on 21K pairs of texts. The two input sequences are concatenated into a single input sequence, with the claim preceded by "CLAIM:" and the evidence preceded by "EVIDENCE:".
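The input format described above can be sketched as a small helper. This is a minimal illustration only: the function name `format_pair`, the single-space separator between the two tagged segments, and the example texts are assumptions, since the card specifies only the "CLAIM:" and "EVIDENCE:" prefixes.

```python
def format_pair(claim: str, evidence: str) -> str:
    """Join a claim and its evidence into one tagged input string."""
    # Assumption: one space separates the two tagged segments; the card
    # only states that the prefixes "CLAIM:" and "EVIDENCE:" are used.
    return f"CLAIM: {claim.strip()} EVIDENCE: {evidence.strip()}"


# Illustrative texts, not taken from the training data.
example = format_pair(
    "Regular exercise improves sleep quality.",
    "A trial found that participants who walked daily fell asleep faster.",
)
print(example)
```

The resulting string is what would be passed to the BERT preprocessor as a single sequence.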

  • Developed by: James Deslandes and Nikolaos Douranos
  • Language(s): English
  • Model type: Supervised
  • Model architecture: BLSTM
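A rough Keras sketch of the trainable part of such an architecture is given below. It is not the authors' code: the number of BLSTM layers, the unit count, the pooling step, the use of `tf.keras.layers.Attention` as the self-attention mechanism, the Adam optimizer, and the BERT hidden size of 768 are all assumptions. The frozen BERT encoder is replaced by an input placeholder for its per-token output.

```python
import tensorflow as tf

SEQ_LEN = 512     # maximum input length stated in this card
HIDDEN = 768      # assumption: hidden size of a BERT-base encoder
LSTM_UNITS = 64   # hypothetical; the card does not give layer sizes


def build_head() -> tf.keras.Model:
    # Stand-in for the per-token output of the frozen BERT encoder.
    tokens = tf.keras.Input(shape=(SEQ_LEN, HIDDEN), name="bert_sequence_output")
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True))(tokens)
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True))(x)
    # Dot-product self-attention: the sequence attends to itself.
    x = tf.keras.layers.Attention()([x, x])
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    # One sigmoid unit for the binary relevant / not relevant decision.
    relevant = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(tokens, relevant)


model = build_head()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # optimizer is assumed
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```

Only the learning rate of 1e-4 and the binary output are taken from this card; everything else is a placeholder to show how the pieces could fit together.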

Model Resources

Training Details

Training Data

This model was trained on 21K claim-evidence pairs.

Training Procedure

Training Hyperparameters

  - batch_size: 32
  - epochs: 4
  - learning_rate: 1e-4
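As a quick sanity check, the number of optimizer steps implied by these values can be worked out as follows. The sketch assumes that "21K" means exactly 21,000 training pairs and that the final partial batch of each epoch is kept; neither is stated in the card.

```python
import math

BATCH_SIZE = 32
EPOCHS = 4
LEARNING_RATE = 1e-4
NUM_TRAIN_PAIRS = 21_000  # assumption: "21K" taken as exactly 21,000

# Assumption: the last, smaller batch of each epoch is not dropped.
steps_per_epoch = math.ceil(NUM_TRAIN_PAIRS / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
print(steps_per_epoch, total_steps)  # 657 2628
```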

Speeds, Sizes, Times

  - overall training time: 16 minutes
  - duration per training epoch: 4 minutes
  - model size: 500MB

Evaluation

Testing Data & Metrics

Testing Data

A separate validation dataset of 6K claim-evidence pairs.

Metrics

  - ROC AUC
  - Specificity
  - Precision
  - Recall
  - F1-score
  - Accuracy
  - Average accuracy over 4 models
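For reference, the threshold-based metrics listed above can be computed from confusion-matrix counts as in the sketch below. The function name `binary_metrics` and the counts are hypothetical and do not reproduce this model's results. ROC AUC is omitted because it needs the model's scores rather than hard predictions.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "specificity": tn / (tn + fp),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only (not the model's results).
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(m)
```

The sketch does not guard against empty denominators, so it assumes at least one positive prediction, one positive label, and one negative label.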

Results

The model obtained an ROC AUC of 0.91, a specificity of 92.8%, a precision of 78.1%, a recall of 66.6%, an F1-score of 71.9% and an accuracy of 85.6%. Four different models with this structure were trained, and their accuracies averaged 85.4%. The error bars in the graph show two standard deviations on either side of the mean.
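Two small checks may help when reading these figures. The first confirms that the reported F1-score follows from the reported precision and recall. The second sketches how error bars of two standard deviations around a mean could be computed. The per-model accuracies in `runs` are hypothetical placeholders, and the use of the sample standard deviation is an assumption, since the card reports only the 85.4% mean.

```python
import statistics

# F1 is the harmonic mean of precision and recall (values from this card).
precision, recall = 0.781, 0.666
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.719


def error_bar(accuracies):
    """Return (lower, upper) bounds at mean +/- 2 standard deviations."""
    mean = statistics.mean(accuracies)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    half_width = 2 * statistics.stdev(accuracies)
    return mean - half_width, mean + half_width


# Hypothetical per-model accuracies, NOT the authors' measurements.
runs = [0.85, 0.86, 0.85, 0.86]
lower, upper = error_bar(runs)
print(lower, upper)
```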

Figure: mean training and validation accuracy and loss.

Technical Specifications

Hardware

  - RAM: at least 4 GB
  - Storage: at least 50 GB
  - GPU: T4

Software

  - TensorFlow
  - TensorFlow Hub (tensorflow_hub)
  - Keras 2

Bias, Risks, and Limitations

Any input (the concatenation of the two sequences) longer than 512 subword tokens will be truncated by the model.
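The practical effect of this limit can be illustrated with a simplified sketch. It assumes that truncation removes tokens from the end of the concatenated sequence and ignores any special tokens the preprocessor may add, so the exact budget in practice may be slightly smaller. The token counts are hypothetical.

```python
MAX_LEN = 512  # limit stated in this card


def truncate(tokens, max_len=MAX_LEN):
    """Keep at most max_len tokens, dropping the tail (assumed behaviour)."""
    return tokens[:max_len]


# Hypothetical lengths: a 100-token claim followed by 500 tokens of evidence.
claim_tokens = ["c"] * 100
evidence_tokens = ["e"] * 500
kept = truncate(claim_tokens + evidence_tokens)
evidence_kept = len(kept) - len(claim_tokens)
print(len(kept), evidence_kept)  # 512 412
```

Under these assumptions the claim is always preserved and long evidence loses its final tokens, so relevance signals near the end of a long evidence passage may not reach the model.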

Additional Information

The hyperparameters were determined by experimentation with different values.