Reaction-Enzyme Interaction Prediction Model

A deep learning framework for predicting reaction center atoms in chemical reactions using multimodal learning that combines molecular graph representations with enzyme sequence embeddings.

Overview

This project implements a multimodal neural network that predicts which atoms in a reactant molecule will participate in a reaction (reaction center prediction) by using:

Molecular Graphs: Graph transformer encoders for atom-level representations
Enzyme Sequences: ESM-2 protein language model embeddings
Cross-Attention Mechanism: Fuses molecular and enzyme information
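To make the prediction target concrete, here is a minimal sketch of what a per-atom reaction center label could look like. It assumes atom-mapped reactions in which an atom counts as a reaction center if any of its bonds is formed, broken, or changes order. This README does not state how labels are actually derived, so the function name `reaction_center_atoms` and the bond-dictionary format are hypothetical.

```python
def reaction_center_atoms(reactant_bonds, product_bonds):
    """Return atom-map numbers whose bonding differs between the two sides.

    Each argument maps frozenset({a, b}) of atom-map numbers to a bond order.
    This labeling rule is an assumption, not the project's documented method.
    """
    changed = set()
    for bond in set(reactant_bonds) | set(product_bonds):
        # A bond missing on one side returns None, so formed and broken
        # bonds are caught together with bond-order changes.
        if reactant_bonds.get(bond) != product_bonds.get(bond):
            changed |= bond
    return sorted(changed)


# Toy ester hydrolysis on heavy atoms only: C1(=O2)-O3-C4 plus water O5.
reactant_bonds = {
    frozenset({1, 2}): 2,  # C=O
    frozenset({1, 3}): 1,  # C-O ester bond, broken in the reaction
    frozenset({3, 4}): 1,  # O-C
}
product_bonds = {
    frozenset({1, 2}): 2,  # C=O unchanged
    frozenset({1, 5}): 1,  # new C-O bond to the water oxygen
    frozenset({3, 4}): 1,  # O-C unchanged
}

centers = reaction_center_atoms(reactant_bonds, product_bonds)
# One binary label per mapped atom 1..5, the kind of target a per-atom
# classifier would be trained against.
labels = [1 if atom in centers else 0 for atom in range(1, 6)]
print(centers, labels)  # [1, 3, 5] [1, 0, 1, 0, 1]
```

Under this assumption, the model's task reduces to binary classification over the atoms of the reactant graph.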

Architecture

The model consists of three main components:

ESM-2 Encoder: Processes enzyme amino acid sequences using Meta AI's ESM-2 protein language model
Graph Transformer Encoder: Encodes molecular structures using graph attention layers
Cross-Attention Fusion: Combines enzyme and molecular representations to predict reaction centers
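The idea behind a graph attention layer can be sketched without any dependencies: each atom updates its feature vector as an attention-weighted average over itself and its bonded neighbors. This is an illustration only. It omits the learned query/key/value projections, multiple heads, and bond features that a real graph transformer layer would have, and the function name `graph_attention` is hypothetical rather than taken from this project.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graph_attention(features, neighbors):
    """One attention pass restricted to each atom and its bonded neighbors.

    features:  list of per-atom feature vectors (equal length)
    neighbors: dict mapping atom index -> list of bonded atom indices
    """
    dim = len(features[0])
    updated = []
    for i, h_i in enumerate(features):
        attended = [i] + list(neighbors[i])  # self plus bonded atoms
        # Scaled dot-product score between atom i and each attended atom.
        scores = [
            sum(a * b for a, b in zip(h_i, features[j])) / math.sqrt(dim)
            for j in attended
        ]
        weights = softmax(scores)
        # Weighted average of the attended atoms' features.
        updated.append([
            sum(w * features[j][k] for w, j in zip(weights, attended))
            for k in range(dim)
        ])
    return updated


# Toy three-atom chain 0-1-2 with two features per atom.
atom_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bonds = {0: [1], 1: [0, 2], 2: [1]}
encoded = graph_attention(atom_features, bonds)
```

Restricting attention to bonded neighbors is what ties the layer to the molecular graph. Stacking several such passes lets information travel across more than one bond.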

┌─────────────────┐     ┌─────────────────────┐
│   ESM-2 Model   │     │  Graph Transformer  │
│  (Protein Seq)  │     │   (Molecular Graph) │
└────────┬────────┘     └─────────┬───────────┘
         │                        │
         ▼                        ▼
┌─────────────────┐     ┌──────────────────┐
│  ESM Projection │     │  Atom Features   │
└────────┬────────┘     └─────────┬────────┘
         │                        │
         └──────────┬─────────────┘
                    │
                    ▼
         ┌─────────────────────┐
         │   Cross-Attention   │
         │      Mechanism      │
         └──────────┬──────────┘
                    │
                    ▼
         ┌─────────────────────┐
         │  Reaction Center    │
         │   Prediction Head   │
         └─────────────────────┘
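The fusion step in the diagram can be sketched as follows. The sketch assumes that atoms act as queries and enzyme residues as keys and values, that the ESM projection has already mapped residue embeddings to the same dimension as the atom features, and that the prediction head is a single linear layer followed by a sigmoid. None of these details are confirmed by this README, and the names `cross_attend` and `center_probabilities` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(atom_feats, residue_feats):
    """Each atom (query) attends over all enzyme residues (keys and values)."""
    dim = len(residue_feats[0])
    fused = []
    for query in atom_feats:
        scores = [
            sum(a * b for a, b in zip(query, residue)) / math.sqrt(dim)
            for residue in residue_feats
        ]
        weights = softmax(scores)
        # Enzyme context vector for this atom.
        context = [
            sum(w * residue[k] for w, residue in zip(weights, residue_feats))
            for k in range(dim)
        ]
        # Residual connection: keep the atom's own features (an assumption).
        fused.append([a + c for a, c in zip(query, context)])
    return fused


def center_probabilities(fused, head_weights, bias=0.0):
    """Linear head plus sigmoid: one reaction center probability per atom."""
    probs = []
    for feats in fused:
        logit = sum(w * x for w, x in zip(head_weights, feats)) + bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


# Toy inputs: 3 atoms and 4 residues, both already in a 2-dimensional space.
atom_feats = [[0.2, -0.1], [1.0, 0.5], [-0.3, 0.8]]
residue_feats = [[0.5, 0.5], [-1.0, 0.2], [0.3, -0.7], [0.0, 1.0]]
fused = cross_attend(atom_feats, residue_feats)
probs = center_probabilities(fused, head_weights=[0.7, -0.4])
```

Thresholding `probs` (for example at 0.5) would give the predicted set of reaction center atoms. In a trained model the weights come from optimization rather than being fixed by hand as they are here.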

Features

Multimodal Learning: Combines protein sequence and molecular graph data
Reaction Center Prediction: Identifies atoms involved in enzymatic reactions
Pre-trained Models: Uses ESM-2 for protein embeddings
Graph Transformers: Custom graph attention layers for molecular encoding
Interactive GUI: Streamlit-based web interface for predictions

Data Sources

The model is trained on data from:

Rhea: Expert-curated database of chemical reactions
RetroRules: Reaction rules for retrosynthesis
UniProt: Protein sequences and EC numbers

Training History

License

This project is licensed under the Apache License 2.0. Feel free to use and modify it.

Acknowledgments

ESM-2 - Evolutionary Scale Modeling
Rhea - Reaction database
RetroRules - Reaction rules database
RDKit - Cheminformatics toolkit
Hugging Face Transformers - Model implementations
PyTorch - Deep learning framework
Kimi AI (K2.5) - Assistance in generating code
