
GitHub:

https://github.com/baaihealth/opi

Paper:

"OPI: An Open Instruction Dataset for Adapting Large Language Models to Protein-Related Tasks" has been accepted at the NeurIPS 2024 workshop "Foundation Models for Science: Progress, Opportunities, and Challenges".

Model Card of OPI-Galactica-6.7B

OPI-Galactica-6.7B was fine-tuned from the Galactica-6.7B model on the complete OPI training set (i.e., OPI_full_1.61M_train.json). For details on training and testing, see https://github.com/baaihealth/opi.
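As a rough illustration of how a fine-tuned checkpoint like this might be queried, here is a minimal sketch using Hugging Face Transformers. It assumes the checkpoint loads through `AutoModelForCausalLM` under the repository name `BAAI/OPI-Galactica-6.7B` and that BF16 weights are acceptable on your hardware. The `build_prompt` template is hypothetical and only for illustration; the prompt format actually used for training and testing is defined in the OPI repository.

```python
MODEL_ID = "BAAI/OPI-Galactica-6.7B"  # repository name of this model card


def build_prompt(instruction: str, sequence: str) -> str:
    """Join an instruction and a protein sequence into a single prompt.

    Hypothetical template, for illustration only; check the OPI
    repository for the template the model was trained with.
    """
    return f"Instruction: {instruction.strip()}\nInput: {sequence.strip()}\nOutput:"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and greedily decode a completion for `prompt`."""
    # Imported lazily so build_prompt can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: BF16 weights; adjust the dtype for your hardware.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # Pass only input_ids so extra tokenizer outputs are not forwarded.
        output_ids = model.generate(
            inputs["input_ids"], max_new_tokens=max_new_tokens, do_sample=False
        )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate(build_prompt("What is the EC number of the input sequence?", "MKT..."))` would download the weights on first use; the instruction text and the sequence here are placeholders, not examples taken from the OPI test sets.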

Overview

Evaluation of the OPI-Galactica-6.7B model on 9 tasks

Each result below was obtained with the Galactica-6.7B model fine-tuned on OPI_full_1.61M.json and then evaluated on the test set of the corresponding task.

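| Task Type | Task Name | Test file | Accuracy | Precision | Recall | F1 | Rouge-L |
|---|---|---|---|---|---|---|---|
| Sequence Understanding | EC Number Prediction (split100) | CLEAN_EC_number_new_test | - | 0.2700 | 0.2663 | 0.2596 | - |
| Sequence Understanding | EC Number Prediction (split100) | CLEAN_EC_number_price_test | - | 0.0268 | 0.0268 | 0.0268 | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Fold_Holdout | 0.0808 | - | - | - | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Superfamily_Holdout | 0.1348 | - | - | - | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Family_Holdout | 0.4854 | - | - | - | - |
| Sequence Understanding | Subcellular Localization Prediction | subcell_loc_test | 0.7771 | - | - | - | - |
| Annotation Prediction | Function Keywords Prediction | CASPSimilarSeq_keywords_test | - | 0.8120 | 0.7360 | 0.7643 | - |
| Annotation Prediction | Function Keywords Prediction | IDFilterSeq_keywords_test | - | 0.8377 | 0.8019 | 0.8070 | - |
| Annotation Prediction | Function Keywords Prediction | UniProtSeq_keywords_test | - | 0.8596 | 0.8196 | 0.8276 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | CASPSimilarSeq_go_terms_test | - | 0.7613 | 0.7492 | 0.7476 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | IDFilterSeq_go_terms_test | - | 0.7404 | 0.7274 | 0.7207 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | UniProtSeq_go_terms_test | - | 0.7638 | 0.7373 | 0.7358 | - |
| Annotation Prediction | Function Description Prediction | CASPSimilarSeq_function_test | - | - | - | - | 0.7430 |
| Annotation Prediction | Function Description Prediction | IDFilterSeq_function_test | - | - | - | - | 0.7014 |
| Annotation Prediction | Function Description Prediction | UniProtSeq_function_test | - | - | - | - | 0.7133 |
| Knowledge Mining | Tissue Location Prediction from Gene Symbol | gene_symbol_to_tissue_test | - | 0.3917 | 0.9077 | 0.5303 | - |
| Knowledge Mining | Cancer Prediction from Gene Symbol | gene_symbol_to_cancer_test | - | 0.3555 | 0.3189 | 0.3229 | - |
| Knowledge Mining | Cancer Prediction from Gene Name | gene_name_to_cancer_test | - | 0.2728 | 0.2554 | 0.2533 | - |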
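For multi-label tasks such as keyword or GO term prediction, precision, recall and F1 are commonly computed per sample on the sets of predicted and reference labels, then averaged over the test set. The sketch below illustrates that scheme; it is an assumption about how such scores can be obtained, not the evaluation script from the OPI repository, and the function names are illustrative.

```python
from typing import Iterable, Set, Tuple


def prf1(predicted: Set[str], reference: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall and F1 for a single sample."""
    overlap = len(predicted & reference)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    # F1 is defined as 0 when there is no overlap at all.
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


def average_prf1(
    pairs: Iterable[Tuple[Iterable[str], Iterable[str]]]
) -> Tuple[float, float, float]:
    """Average the per-sample scores over (predicted, reference) pairs."""
    scores = [prf1(set(pred), set(ref)) for pred, ref in pairs]
    if not scores:
        raise ValueError("need at least one (predicted, reference) pair")
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


# Toy labels, invented for illustration only.
pairs = [
    ({"Kinase", "Transferase"}, {"Kinase"}),
    ({"Membrane"}, {"Membrane", "Transport"}),
]
precision, recall, f1 = average_prf1(pairs)
```

With per-sample averaging, the averaged F1 need not equal the harmonic mean of the averaged precision and recall. In the toy data, precision and recall both average 0.75 while F1 averages about 0.667. The table shows the same pattern: for gene_symbol_to_tissue_test, the harmonic mean of 0.3917 and 0.9077 is about 0.547, while the reported F1 is 0.5303. This is consistent with per-sample averaging but does not confirm it.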

Prediction comparison with SOTA models

[Figures: nine model comparison images (model_compare)]

Demo

We use the FastChat platform to demonstrate visually what the OPI-Galactica-6.7B model can do on the evaluation tasks.

[Figure: OPI Demo]
