# Model Card for h10505jd-a63140nd-ED-Opt-B

This is a sequence relation classification model trained to detect whether a given piece of evidence is relevant to a given claim.

## Model Details

### Model Description

This model addresses the Evidence Detection (ED) shared task: given a claim and a piece of evidence, determine whether the evidence is relevant to that claim (binary classification).
The model uses a BERT preprocessor and encoder, neither of which was fine-tuned, feeding into a multi-layer BLSTM with a self-attention mechanism that was trained on 21K text pairs.
The claim and the evidence are concatenated into a single input sequence, prefixed with "CLAIM:" and "EVIDENCE:" respectively.

- **Developed by:** James Deslandes and Nikolaos Douranos
- **Language(s):** English
- **Model type:** Supervised
- **Model architecture:** BLSTM

### Model Resources

- **Preprocessor:** https://kaggle.com/models/tensorflow/bert/TensorFlow2/en-uncased-preprocess/3
- **Encoder Model:** https://www.kaggle.com/models/tensorflow/bert/TensorFlow2/en-uncased-l-12-h-768-a-12/4
- **Repo:** https://huggingface.co/Jed612/encoder-BLSTM

## Training Details

### Training Data

This model was trained on 21K claim-evidence pairs.

### Training Procedure

#### Training Hyperparameters

- batch_size: 32
- epochs: 4
- learning_rate: 1e-4

#### Speeds, Sizes, Times

- overall training time: 16 minutes
- duration per training epoch: 4 minutes
- model size: 500MB

## Evaluation

### Testing Data & Metrics

#### Testing Data

A separate validation dataset of 6K claim-evidence pairs.

#### Metrics

- ROC AUC
- Specificity
- Precision
- Recall
- F1-score
- Accuracy
- Average accuracy over 4 models

### Results

The model obtained an ROC AUC of 0.91, a specificity of 92.8%, a precision of 78.1%, a recall of 66.6%, an F1-score of 71.9% and an accuracy of 85.6%.
Four different models with this structure were trained, and their accuracies averaged 85.4%.
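
As a rough illustration of how the threshold-based metrics listed above relate to one another, the sketch below derives them from confusion-matrix counts and checks that the reported F1-score follows from the reported precision and recall. The function names and the example counts are hypothetical and are not taken from the original evaluation code; ROC AUC is omitted because it requires the model's raw scores rather than counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive the threshold-based metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)            # share of predicted-relevant pairs that are relevant
    recall = tp / (tp + fn)               # share of relevant pairs that were found
    specificity = tn / (tn + fp)          # share of irrelevant pairs correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Consistency check using the precision and recall reported in this card.
print(f"{f1_from_precision_recall(0.781, 0.666):.3f}")  # 0.719

# Hypothetical counts, for illustration only (not the model's real confusion matrix).
print(classification_metrics(tp=50, fp=10, tn=30, fn=10))
```

The first printed value agrees with the 71.9% F1-score reported above.
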
The error bars show two standard deviations on either side of the mean.

**Training and Validation Accuracy and Loss Mean:**

![Graph of Training and Validation Accuracy and Loss Mean](https://external-content.duckduckgo.com/iu/?u=http%3A%2F%2Fdrive.google.com/uc?id=1gi_5a4mfwzQae6J1dX_IXf7p_C5knWeQ)

## Technical Specifications

### Hardware

- RAM: at least 4 GB
- Storage: at least 50 GB
- GPU: T4

### Software

- TensorFlow
- TensorFlow Hub
- Keras 2

## Bias, Risks, and Limitations

Any input (the concatenation of the two sequences) longer than 512 subwords will be truncated by the model.

## Additional Information

The hyperparameters were determined by experimentation with different values.
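
The input format described in the Model Description can be sketched as below. This is a minimal illustration only: the helper name `format_pair` is hypothetical, and the single-space separator between the two parts is an assumption, since the card does not state the exact separator used during training.

```python
def format_pair(claim: str, evidence: str) -> str:
    """Join a claim and its evidence into one input string.

    Each part is prefixed with its label, as described in the Model
    Description. The single-space separator is an assumption.
    """
    return f"CLAIM: {claim.strip()} EVIDENCE: {evidence.strip()}"


example = format_pair(
    "We should ban fossil fuels.",
    "Burning coal releases carbon dioxide.",
)
print(example)
```

Because the combined string is truncated at 512 subwords, a long claim leaves less room for the evidence text that follows it.
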