
	---
	license: mit
	datasets:
	- bigbio/biored
	language:
	- en
	metrics:
	- f1
	---


	# Model Card for BioNExt

	BioNExt is an end-to-end Biomedical Relation Extraction and Classification system. It uses three modules: a Tagger (Named Entity Recognition), a Linker (Entity Linking), and an Extractor (Relation Extraction and Classification).

	This repository contains two models:

	1. Tagger: Named Entity Recognition module, which performs six-class biomedical NER: Genes, Diseases, Chemicals, Variants (mutations), Species, and Cell Lines.
	2. Extractor: Relation Extraction and Classification module. The relation classes are: Positive Correlation, Negative Correlation, Association, Binding, Drug Interaction, Cotreatment, Comparison, and Conversion.
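The label inventories listed above can be written down as plain constants. The following is a minimal sketch only: the constant names, the spelling of each label, and the `Relation` container are hypothetical and are not taken from the BioNExt code base, and the label ids used by the released checkpoints may differ.

```python
from dataclasses import dataclass

# Label inventories copied from the model card text. Names and ordering are
# illustrative assumptions, not the checkpoints' actual label ids.
ENTITY_TYPES = ("Gene", "Disease", "Chemical", "Variant", "Species", "CellLine")
RELATION_TYPES = (
    "Positive Correlation",
    "Negative Correlation",
    "Association",
    "Binding",
    "Drug Interaction",
    "Cotreatment",
    "Comparison",
    "Conversion",
)


@dataclass(frozen=True)
class Relation:
    """Hypothetical container for one relation between two linked entities."""

    head_id: str  # identifier of the first entity (as produced by a linker)
    tail_id: str  # identifier of the second entity
    label: str    # one of RELATION_TYPES

    def __post_init__(self):
        # Reject labels outside the eight relation classes listed in the card.
        if self.label not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.label!r}")
```

For example, `Relation("entity_a", "entity_b", "Binding")` constructs successfully, while a label outside the eight classes raises `ValueError`.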

	For a full description of how to use the end-to-end pipeline, see our [GitHub](https://github.com/ieeta-pt/BioNExt) repository.


	- Developed by: IEETA
	- Model type: BERT Base
	- Language(s) (NLP): English
	- License: MIT
	- Finetuned from model: BioLinkBERT-Large

	### Model Sources

	- Repository: [IEETA BioNExt GitHub](https://github.com/ieeta-pt/BioNExt)
	- Paper: Towards Discovery: An End-to-End System for Uncovering Novel Biomedical Relations [Awaiting Publication]

	Authors:
	- Tiago Almeida ([ORCID: 0000-0002-4258-3350](https://orcid.org/0000-0002-4258-3350))
	- Richard A A Jonker ([ORCID: 0000-0002-3806-6940](https://orcid.org/0000-0002-3806-6940))
	- Rui Antunes ([ORCID: 0000-0003-3533-8872](https://orcid.org/0000-0003-3533-8872))
	- João R Almeida ([ORCID: 0000-0003-0729-2264](https://orcid.org/0000-0003-0729-2264))
	- Sérgio Matos ([ORCID: 0000-0003-1941-3983](https://orcid.org/0000-0003-1941-3983))


	## Uses

	The model is intended for academic purposes only. We accept no liability for its use in any professional or medical setting.

	## How to Get Started with the Model

	Please refer to our GitHub repository for more information on our end-to-end inference pipeline: [IEETA BioNExt GitHub](https://github.com/ieeta-pt/BioNExt)


	## Training Data

	The models were trained on the BioRED corpus, within the scope of the BioCreative-VIII challenge.

	Ling Luo, Po-Ting Lai, Chih-Hsuan Wei, Cecilia N Arighi, Zhiyong Lu, BioRED: a rich biomedical relation extraction dataset, Briefings in Bioinformatics, Volume 23, Issue 5, September 2022, bbac282, https://doi.org/10.1093/bib/bbac282


	## Results

	Evaluated as an end-to-end system, BioNExt obtains the following F1 scores (%):
	- Tagger: 43.10
	- Linker: 32.46
	- Extractor: 24.59

	| Configuration | Entity Pair (P/R/F%) | + Relation (P/R/F%) | + Novel (P/R/F%) |
	|---|---|---|---|
	| Competition best | -/-/55.84 | -/-/43.03 | -/-/32.75 |
	| BioNExt (end-to-end) | 45.89/40.63/43.10 | 34.56/30.60/32.46 | 26.18/23.18/24.59 |
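As a sanity check on the table above, the F column can be recomputed from precision and recall, assuming F is the standard F1 score (the harmonic mean of the two). This is a minimal sketch: the helper `f1` and the `reported` dictionary are illustrative names, not part of the BioNExt code.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) for BioNExt (end-to-end), copied from the table.
reported = {
    "Entity Pair": (45.89, 40.63, 43.10),
    "+ Relation": (34.56, 30.60, 32.46),
    "+ Novel": (26.18, 23.18, 24.59),
}

for level, (p, r, f) in reported.items():
    print(f"{level}: computed F1 = {f1(p, r):.2f}, reported = {f:.2f}")
```

Under that assumption, the recomputed values agree with the reported F scores to two decimal places at all three evaluation levels.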


	## Citation

	BibTeX:

	[Awaiting Publication]