# Model Card for h10505jd-a63140nd-ED-Opt-B

<!-- Provide a quick summary of what the model is/does. -->

This is a sequence relation classification model trained to detect whether a given piece of evidence is relevant to a given claim.


## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This model addresses the Evidence Detection (ED) shared task: given a claim and a piece of evidence, determine whether the evidence is relevant to that claim (binary classification). It uses a BERT preprocessor and encoder, which have not been fine-tuned, feeding into a multi-layer BLSTM with a self-attention mechanism that was trained on 21K claim-evidence pairs. The two input sequences are concatenated into a single input sequence, with the claim preceded by "CLAIM:" and the evidence preceded by "EVIDENCE:".

- **Developed by:** James Deslandes and Nikolaos Douranos
- **Language(s):** English
- **Model type:** Supervised
- **Model architecture:** BLSTM
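
The input format described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the function name `format_pair`, the single-space separators, and the example sentences are all assumptions, since the card only states that the two sequences are concatenated with the "CLAIM:" and "EVIDENCE:" prefixes.

```python
def format_pair(claim, evidence):
    """Join a claim and a piece of evidence into one input string.

    Assumption: a single space separates each prefix from its text and
    the claim part from the evidence part; the card does not specify this.
    """
    return f"CLAIM: {claim.strip()} EVIDENCE: {evidence.strip()}"


# Made-up example sentences, for illustration only.
example = format_pair(
    "Regular exercise improves sleep.",
    "Participants who exercised reported falling asleep faster.",
)
print(example)
```

The resulting string would then be passed to the BERT preprocessor as a single sequence.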

### Model Resources

<!-- Provide links where applicable. -->

- **Preprocessor:** https://kaggle.com/models/tensorflow/bert/TensorFlow2/en-uncased-preprocess/3
- **Encoder Model:** https://www.kaggle.com/models/tensorflow/bert/TensorFlow2/en-uncased-l-12-h-768-a-12/4
- **Repo:** https://huggingface.co/Jed612/encoder-BLSTM

## Training Details

### Training Data

This model was trained on 21K claim-evidence pairs.

### Training Procedure

#### Training Hyperparameters

- batch_size: 32
- epochs: 4
- learning_rate: 1e-4

#### Speeds, Sizes, Times

- Overall training time: 16 minutes
- Duration per training epoch: 4 minutes
- Model size: 500 MB

## Evaluation

### Testing Data & Metrics

#### Testing Data

A separate validation dataset of 6K claim-evidence pairs.

#### Metrics

- ROC AUC
- Specificity
- Precision
- Recall
- F1-score
- Accuracy
- Average accuracy over 4 models
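
For reference, the threshold-based metrics in this list can all be derived from a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from this model's evaluation. ROC AUC is not included because it needs the model's predicted scores rather than confusion-matrix counts.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    precision = _ratio(tp, tp + fp)      # of predicted-relevant pairs, how many are relevant
    recall = _ratio(tp, tp + fn)         # of relevant pairs, how many were found
    specificity = _ratio(tn, tn + fp)    # of irrelevant pairs, how many were rejected
    f1 = _ratio(2 * precision * recall, precision + recall)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts, for illustration only (not this model's results).
metrics = binary_metrics(tp=60, fp=20, tn=100, fn=20)
print(metrics)
```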

### Results

The model obtained an ROC AUC of 0.91, a specificity of 92.8%, a precision of 78.1%, a recall of 66.6%, an F1-score of 71.9% and an accuracy of 85.6%. Four different models with this structure were trained and their accuracies averaged to 85.4%. The error bars in the graph below show two standard deviations on either side of the mean.

**Training and Validation Accuracy and Loss Mean:**

![Graph of Training and Validation Accuracy and Loss Mean](https://external-content.duckduckgo.com/iu/?u=http%3A%2F%2Fdrive.google.com/uc?id=1gi_5a4mfwzQae6J1dX_IXf7p_C5knWeQ)
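
The error bars described above (mean plus or minus twice the standard deviation across the four trained models) could be computed along these lines. This is only a sketch: the per-model accuracies are placeholder values, not the actual runs, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the card does not say which estimator was used.

```python
import statistics


def mean_with_error_bar(values, k=2.0):
    """Return (mean, half_width), where half_width is k standard deviations."""
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    std = statistics.stdev(values)
    return mean, k * std


# Placeholder accuracies for four models, NOT the values behind this card.
accuracies = [0.80, 0.82, 0.84, 0.86]
mean, half_width = mean_with_error_bar(accuracies)
print(f"{mean:.3f} +/- {half_width:.3f}")
```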

## Technical Specifications

### Hardware


- RAM: at least 4 GB
- Storage: at least 50 GB
- GPU: T4

### Software

- TensorFlow
- TensorFlow Hub
- Keras 2

## Bias, Risks, and Limitations

Any input (the concatenation of the two sequences) longer than 512 subwords will be truncated by the model.
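
Because the claim comes first in the concatenated input, truncation would be expected to remove text from the end of the evidence. The sketch below illustrates this with stand-in tokens rather than real BERT subwords. It assumes that the two special tokens BERT adds (`[CLS]` and `[SEP]`) count toward the 512 limit, which the card does not state; `truncate_tokens` is a hypothetical helper, not part of the model's code.

```python
def truncate_tokens(tokens, max_len=512, reserved=2):
    """Keep the leading tokens that fit once `reserved` special tokens are counted.

    Assumption: `reserved` covers BERT's [CLS] and [SEP] tokens.
    """
    budget = max(max_len - reserved, 0)
    return tokens[:budget]


# Stand-in tokens: a short claim followed by a long piece of evidence.
claim_tokens = ["CLAIM:"] + ["c"] * 9            # 10 tokens
evidence_tokens = ["EVIDENCE:"] + ["e"] * 599    # 600 tokens

kept = truncate_tokens(claim_tokens + evidence_tokens)
dropped = len(claim_tokens) + len(evidence_tokens) - len(kept)
print(len(kept), dropped)
```

Under these assumptions the whole claim survives and only the tail of the evidence is lost, so very long evidence passages may be judged on their opening portion alone.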

## Additional Information

The hyperparameters were determined by experimentation with different values.