# MassSpecGym Ranker: Contrastive Spectral Transformer
This model is a Spectral Transformer trained for high-precision molecular identification from tandem mass spectrometry (MS/MS) spectra. It was developed as part of the MassSpecGym benchmark.
## Model Details
- Architecture: Transformer Encoder (2 layers, 4 heads) with Fourier Feature m/z encoding.
- Objective: Contrastive Learning (InfoNCE) with a temperature of 0.1.
- Input: MS/MS fragment peaks (m/z and intensity) + Precursor mass.
- Output: 4096-dimensional molecular embedding for candidate ranking.
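The contrastive objective above can be sketched as a standard batch-wise InfoNCE loss. This is a minimal illustration, not the repository's actual training code: each spectrum embedding is pulled toward the embedding of its true molecule, with the other molecules in the batch acting as negatives, at the stated temperature of 0.1.

```python
import torch
import torch.nn.functional as F

def info_nce(spec_emb: torch.Tensor, mol_emb: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """Batch-wise InfoNCE: row i of spec_emb is positive with row i of
    mol_emb; all other rows in the batch serve as in-batch negatives."""
    spec = F.normalize(spec_emb, dim=-1)
    mol = F.normalize(mol_emb, dim=-1)
    logits = spec @ mol.T / temperature          # (B, B) cosine similarities
    targets = torch.arange(spec.size(0))         # positives lie on the diagonal
    return F.cross_entropy(logits, targets)
```

With perfectly aligned, mutually orthogonal embeddings the loss approaches zero, which is a quick sanity check when wiring up a training loop.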
## Performance (MassSpecGym Test Set)
The model significantly outperforms standard MLP baselines:
- Hit@1: 8.38%
- Hit@5: 20.35%
- Hit@20: 42.00%
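The Hit@k metrics above count how often the true molecule appears among the top-k candidates when ranked by similarity to the spectrum embedding. A minimal sketch of that computation (the function name and cosine-similarity scoring are assumptions, not the benchmark's official evaluation code):

```python
import torch
import torch.nn.functional as F

def hit_at_k(query_emb: torch.Tensor, cand_embs: torch.Tensor,
             true_idx: int, k: int) -> int:
    """Rank candidates by cosine similarity to the spectrum embedding;
    returns 1 if the true molecule is in the top-k, else 0."""
    q = F.normalize(query_emb, dim=-1)
    c = F.normalize(cand_embs, dim=-1)
    scores = c @ q                               # (num_candidates,)
    topk = scores.topk(min(k, scores.numel())).indices
    return int(true_idx in topk)
```

Averaging this indicator over all test spectra yields the Hit@k percentages reported above.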
## Key Features
- Fourier Features: Captures high-precision mass differences essential for isotope identification.
- Precursor Injection: Provides global context to every spectral peak.
- Attention Pooling: Dynamically weights diagnostic peaks while down-weighting noise.
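The three features above can be combined into a single encoder sketch. The module below is a hypothetical illustration under the stated hyperparameters (2 layers, 4 heads, 4096-d output); the class names, frequency range, and model width are assumptions, not the repository's implementation. Each m/z value gets sin/cos Fourier features at log-spaced wavelengths, the precursor embedding is added to every peak token, and attention pooling collapses the peak sequence into one embedding.

```python
import math
import torch
import torch.nn as nn

class FourierMZEncoding(nn.Module):
    """Sin/cos features of m/z at log-spaced wavelengths, so the model
    can resolve small, high-precision mass differences."""
    def __init__(self, num_freqs: int = 32, max_mz: float = 2000.0):
        super().__init__()
        wavelengths = torch.exp(
            torch.linspace(math.log(0.01), math.log(max_mz), num_freqs))
        self.register_buffer("omega", 2 * math.pi / wavelengths)

    def forward(self, mz: torch.Tensor) -> torch.Tensor:
        # (B, P) -> (B, P, 2 * num_freqs)
        angles = mz.unsqueeze(-1) * self.omega
        return torch.cat([angles.sin(), angles.cos()], dim=-1)

class SpectralEncoder(nn.Module):
    """Hypothetical end-to-end sketch: Fourier m/z features + intensity
    become peak tokens, the precursor-mass embedding is injected into
    every token, a small Transformer mixes the peaks, and attention
    pooling produces one 4096-d spectrum embedding."""
    def __init__(self, d_model: int = 128, num_freqs: int = 32,
                 out_dim: int = 4096):
        super().__init__()
        self.mz_enc = FourierMZEncoding(num_freqs)
        self.peak_proj = nn.Linear(2 * num_freqs + 1, d_model)  # + intensity
        self.prec_proj = nn.Linear(2 * num_freqs, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.attn_score = nn.Linear(d_model, 1)  # attention pooling weights
        self.out = nn.Linear(d_model, out_dim)

    def forward(self, mz, intensity, precursor_mz):
        # Peak tokens: Fourier m/z features concatenated with intensity.
        tokens = self.peak_proj(
            torch.cat([self.mz_enc(mz), intensity.unsqueeze(-1)], dim=-1))
        # Precursor injection: add the precursor embedding to every peak.
        tokens = tokens + self.prec_proj(self.mz_enc(precursor_mz.unsqueeze(-1)))
        h = self.encoder(tokens)
        # Attention pooling: softmax over peaks, weighted sum of tokens.
        w = self.attn_score(h).softmax(dim=1)    # (B, P, 1)
        return self.out((w * h).sum(dim=1))      # (B, out_dim)
```

In this sketch the learned softmax weights let diagnostic peaks dominate the pooled representation while low-information noise peaks receive near-zero weight.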
## Usage
For full implementation, training scripts, and inference notebooks, visit the GitHub Repository.