---
library_name: tedbench
tags:
- protein
- structure-sequence
- fold-classification
- tedbench
- saprot
pipeline_tag: other
license: bsd-3-clause
---

# TEDBench: SaProt-650M fine-tuned on TEDBench

Backbone: SaProt-650M (33 layers, hidden dim 1280). Requires [Foldseek](https://github.com/steineggerlab/foldseek) to compute the structure-aware tokens.

Fine-tuned on [TEDBench](https://github.com/BorgwardtLab/TEDBench) for protein fold classification into 965 CATH topology (T-level) classes (ICML 2026).

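To make the Foldseek requirement concrete, here is a minimal sketch of how a SaProt-style structure-aware sequence is commonly built: each amino-acid letter is paired with the lowercase Foldseek 3Di state of the same residue, and `#` marks a residue whose structure token is masked. The helper name `to_structure_aware_sequence` is hypothetical, and it assumes the 3Di string has already been produced by Foldseek; the preprocessing in the TEDBench repository may differ.

```python
def to_structure_aware_sequence(aa_seq: str, di_seq: str, mask: str = "#") -> str:
    """Interleave residues with lowercase 3Di states, e.g. "MEV" + "DVP" -> "MdEvVp".

    Hypothetical helper for illustration; not part of the TEDBench code base.
    """
    if len(aa_seq) != len(di_seq):
        raise ValueError("amino-acid and 3Di sequences must have the same length")
    tokens = []
    for aa, di in zip(aa_seq, di_seq):
        # Keep the mask symbol unchanged; lowercase real 3Di letters.
        state = mask if di == mask else di.lower()
        tokens.append(aa.upper() + state)
    return "".join(tokens)


print(to_structure_aware_sequence("MEVQ", "DVPP"))  # MdEvVpQp
```

The length check matters because the two strings must be aligned residue by residue; a mismatch usually means the sequence and the structure come from different chains or domain boundaries.
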
## Usage

Load the checkpoint from the Hub (run from the TEDBench repo root):

```python
import json
import sys
from pathlib import Path

# Make `models` importable; assumes the working directory is the repo root.
sys.path.insert(0, "baselines")

import torch
from huggingface_hub import snapshot_download
from omegaconf import OmegaConf

from models.saprot_classifier import SaProtClassifier

# Download (or reuse the cached copy of) the model repository.
local_dir = Path(snapshot_download("TEDBench/saprot-650M-ft"))

# Build the model from the stored config.
with open(local_dir / "config.json") as f:
    cfg = OmegaConf.create(json.load(f))
model = SaProtClassifier(cfg)

# Load the fine-tuned weights on CPU and switch to inference mode.
sd = torch.load(local_dir / "pytorch_model.bin", map_location="cpu", weights_only=False)
model.load_state_dict(sd)
model.eval()
```

Alternatively, pass the Hub repo ID directly to the test script:

```bash
python baselines/saprot_test_ted.py train.ckpt_path=TEDBench/saprot-650M-ft
```

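Since the model classifies each domain into one of 965 CATH topology classes, a common post-processing step is to turn the per-class scores into a ranked shortlist. The sketch below assumes the classifier yields one logit per class for a domain; the forward signature of `SaProtClassifier` and the mapping from class index to CATH code are not shown in this card, so `top_k_classes` is a hypothetical helper that works on a plain list of floats.

```python
import math


def top_k_classes(logits, k=5):
    """Return the k highest-scoring (class_index, probability) pairs.

    Hypothetical helper: applies a softmax to raw logits and ranks the classes.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Toy input: 965 logits with two classes that clearly stand out.
logits = [0.0] * 965
logits[42] = 10.0
logits[7] = 5.0
for class_index, prob in top_k_classes(logits, k=2):
    print(class_index, round(prob, 3))
```

Reporting the top few classes instead of only the argmax is useful when several topologies are structurally close and the top probabilities are similar.
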
## Citation

```bibtex
@inproceedings{chen2026tedbench,
  title={Protein Fold Classification at Scale: Benchmarking and Pretraining},
  author={Chen, Dexiong and Manolache, Andrei and Niepert, Mathias and Borgwardt, Karsten},
  booktitle={Proceedings of the 43rd International Conference on Machine Learning},
  year={2026}
}
```