Reaction-Enzyme Interaction Prediction Model

A deep learning framework for predicting reaction center atoms in chemical reactions using multimodal learning that combines molecular graph representations with enzyme sequence embeddings.

Overview

This project implements a multimodal neural network that predicts which atoms in a reactant molecule will participate in a reaction (reaction center prediction) by using:

  • Molecular Graphs: Graph transformer encoders for atom-level representations
  • Enzyme Sequences: ESM-2 protein language model embeddings
  • Cross-Attention Mechanism: Fuses molecular and enzyme information

Architecture

The model consists of three main components:

  1. ESM-2 Encoder: Processes enzyme amino acid sequences using Meta AI's ESM-2 protein language model
  2. Graph Transformer Encoder: Encodes molecular structures using graph attention layers
  3. Cross-Attention Fusion: Combines enzyme and molecular representations to predict reaction centers

┌─────────────────┐     ┌─────────────────────┐
│   ESM-2 Model   │     │  Graph Transformer  │
│  (Protein Seq)  │     │  (Molecular Graph)  │
└────────┬────────┘     └──────────┬──────────┘
         │                         │
         ▼                         ▼
┌─────────────────┐     ┌─────────────────────┐
│  ESM Projection │     │    Atom Features    │
└────────┬────────┘     └──────────┬──────────┘
         │                         │
         └────────────┬────────────┘
                      │
                      ▼
           ┌─────────────────────┐
           │   Cross-Attention   │
           │      Mechanism      │
           └──────────┬──────────┘
                      │
                      ▼
           ┌─────────────────────┐
           │   Reaction Center   │
           │   Prediction Head   │
           └─────────────────────┘
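
The fusion step in the diagram can be illustrated with a small NumPy sketch. This is not the project's implementation: it assumes a single attention head in which atoms act as queries and projected enzyme residues act as keys and values, followed by a per-atom sigmoid head. All names (`cross_attention_fusion`, `init_params`), the toy dimensions, and the randomly initialized weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(d_model, d_esm, seed=0):
    # Hypothetical, randomly initialized weights (a trained model would learn these).
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_proj": rng.normal(scale=scale, size=(d_esm, d_model)),  # ESM projection
        "W_q": rng.normal(scale=scale, size=(d_model, d_model)),
        "W_k": rng.normal(scale=scale, size=(d_model, d_model)),
        "W_v": rng.normal(scale=scale, size=(d_model, d_model)),
        "w_out": rng.normal(scale=scale, size=(d_model,)),         # prediction head
        "b_out": 0.0,
    }


def cross_attention_fusion(atom_feats, enzyme_feats, params):
    """atom_feats: (n_atoms, d_model); enzyme_feats: (n_residues, d_esm)."""
    # Project enzyme embeddings into the atom feature space.
    enzyme_proj = enzyme_feats @ params["W_proj"]
    # Atoms query the enzyme residues.
    q = atom_feats @ params["W_q"]
    k = enzyme_proj @ params["W_k"]
    v = enzyme_proj @ params["W_v"]
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)            # (n_atoms, n_residues)
    # Residual connection keeps the original atom information.
    fused = atom_feats + attn @ v
    # Per-atom probability of belonging to the reaction center.
    logits = fused @ params["w_out"] + params["b_out"]
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs, attn


# Toy inputs: 5 atoms with 16-dim features, 12 residues with 32-dim embeddings.
rng = np.random.default_rng(1)
atom_feats = rng.normal(size=(5, 16))
enzyme_feats = rng.normal(size=(12, 32))
params = init_params(d_model=16, d_esm=32)

probs, attn = cross_attention_fusion(atom_feats, enzyme_feats, params)
print(probs.shape, attn.shape)
```

Each row of `attn` shows how strongly one atom attends to each enzyme residue, and `probs` holds one score per atom. Real ESM-2 embeddings are much wider than the toy 32 dimensions used here.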

Features

  • Multimodal Learning: Combines protein sequence and molecular graph data
  • Reaction Center Prediction: Identifies atoms involved in enzymatic reactions
  • Pre-trained Models: Uses ESM-2 for protein embeddings
  • Graph Transformers: Custom graph attention layers for molecular encoding
  • Interactive GUI: Streamlit-based web interface for predictions
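
To make the graph attention feature above more concrete, here is a minimal NumPy sketch of one common formulation: self-attention over atoms in which each atom may attend only to itself and its bonded neighbours. The project's custom layers may differ. The function name `masked_graph_attention`, the feature size, and the random weights are assumptions for illustration.

```python
import numpy as np


def masked_graph_attention(x, adj, W_q, W_k, W_v):
    """One attention layer restricted to the molecular graph.

    x:   (n_atoms, d) atom features
    adj: (n_atoms, n_atoms) 0/1 bond adjacency matrix
    """
    n = x.shape[0]
    # Allow attention along bonds and to the atom itself (self-loops).
    mask = adj.astype(bool) | np.eye(n, dtype=bool)
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Non-bonded pairs get -inf so their softmax weight is exactly zero.
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Residual connection around the attention update.
    return x + weights @ v, weights


# Toy molecule: the heavy atoms of ethanol, C-C-O.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])

rng = np.random.default_rng(0)
d = 8  # hypothetical feature size
x = rng.normal(size=(3, d))
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))

out, weights = masked_graph_attention(x, adj, W_q, W_k, W_v)
print(out.shape)
print(weights.round(3))
```

In this toy graph the first carbon and the oxygen are not bonded, so their mutual attention weights are zero. Stacking several such layers lets information travel across more than one bond.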

Data Sources

The model is trained on data from:

  • Rhea: Expert-curated database of chemical reactions
  • RetroRules: Reaction rules for retrosynthesis
  • UniProt: Protein sequences and EC numbers

Training History

Placeholder for training_history

License

This project is licensed under the Apache License 2.0. Feel free to use and modify it.

Acknowledgments
