adibvafa committed
Commit 56dc880 · verified · Parent: 467ea0f

Update README.md

Files changed (1):
  1. README.md +7 -9
README.md CHANGED
@@ -29,15 +29,6 @@ datasets:
 
 GO-GPT is a decoder-only transformer model for predicting Gene Ontology (GO) terms from protein sequences. It combines ESM2 protein language model embeddings with an autoregressive decoder to generate GO term annotations across all three ontology aspects: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC).
 
-Unlike discriminative methods, GO-GPT treats GO prediction as a sequence generation task, capturing hierarchical and cross-aspect dependencies to achieve state-of-the-art weighted F_max of 0.65-0.70.
-
-| Component | Description |
-|-----------|-------------|
-| Protein Encoder | ESM2-3B (`facebook/esm2_t36_3B_UR50D`) |
-| Decoder | 12-layer GPT with prefix causal attention |
-| Total Parameters | ~3.2B (3B ESM2 + 200M decoder) |
-
-
 <div align="center" style="background-color: #1a1a2e; border: 2px solid #00B89E; border-radius: 12px; padding: 20px 30px; margin: 20px 0;">
 <h3 style="color: white;">🚀 Getting Started</h3>
 <p style="color: white;"><b>Check out the GitHub repo for full setup instructions and local inference guide:</b></p>
@@ -46,6 +37,13 @@ Unlike discriminative methods, GO-GPT treats GO prediction as a sequence generat
 </a>
 </div>
 
+Unlike discriminative methods, GO-GPT treats GO prediction as a sequence generation task, capturing hierarchical and cross-aspect dependencies to achieve state-of-the-art weighted F_max of 0.65-0.70.
+
+| Component | Description |
+|-----------|-------------|
+| Protein Encoder | ESM2-3B (`facebook/esm2_t36_3B_UR50D`) |
+| Decoder | 12-layer GPT with prefix causal attention |
+| Total Parameters | ~3.2B (3B ESM2 + 200M decoder) |
 
 
 **Training data:** [wanglab/gogpt-training-data](https://huggingface.co/datasets/wanglab/gogpt-training-data)
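The "prefix causal attention" named in the component table can be sketched with a small mask-building example. This is a hypothetical, framework-agnostic illustration in NumPy — the function name `prefix_causal_mask` and the toy sizes are assumptions, not the repository's actual code. The idea: positions holding the ESM2 protein-embedding prefix are fully visible to every position, while the generated GO-term tokens attend causally.

```python
import numpy as np

def prefix_causal_mask(prefix_len: int, total_len: int) -> np.ndarray:
    """Boolean mask where entry [i, j] is True if position i may attend to j.

    The first `prefix_len` positions (the protein-embedding prefix) are
    visible to every position; the remaining positions (generated GO-term
    tokens) attend only to themselves and earlier positions (causal).
    """
    # Start from a standard causal (lower-triangular) mask...
    mask = np.tril(np.ones((total_len, total_len), dtype=bool))
    # ...then open up the prefix columns so the protein embedding is
    # fully visible, including bidirectionally within the prefix itself.
    mask[:, :prefix_len] = True
    return mask

# 3 protein-prefix positions followed by 2 GO-term positions:
m = prefix_causal_mask(3, 5)
```

Within the prefix the mask is bidirectional (`m[0, 2]` is True), while a GO-term token still cannot attend to a later one (`m[3, 4]` is False) — which is what lets the decoder condition on the whole protein while generating terms left to right.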