Evolla-10B

A frontier protein-language generative model, because proteins deserve better small talk.

Live Demo | Paper on bioRxiv | GitHub Repository | Post on X

Model Description

Evolla is an 80-billion-parameter protein-language generative model, also released in 10B variants, designed to decode the molecular language of proteins. It integrates information from protein sequences, structures, and user queries to generate precise, context-aware insights into protein function.

This repository contains the 10B-parameter model, trained with Causal Protein-Language Modeling (CPLM). The weights are stored in the original custom format.

Note: This set of model parameters is designed to be used with our original GitHub repository. If you want to use Evolla directly with the standard 🤗 Transformers library, please check out the Official Evolla Documentation in Transformers and use Evolla-10B-hf or Evolla-10B-DPO-hf.

Usage with Original Repository

To use this model, you should clone our official repository and set up the environment as follows:

1. Clone the repository and install dependencies:

git clone https://github.com/westlake-repl/Evolla.git
cd Evolla
conda create -n Evolla python=3.10
conda activate Evolla
bash environment.sh

2. Download the model weights:

mkdir -p ckpt/huggingface
cd ckpt/huggingface
git lfs install
git clone https://huggingface.co/westlake-repl/Evolla-10B
git clone https://huggingface.co/westlake-repl/Evolla-10B-DPO
git clone https://huggingface.co/westlake-repl/SaProt_650M_AF2
git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
cd ../..
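
Before moving on, it can help to confirm that all four clones from step 2 actually landed, since a clone can fail partway (for example, if access to a repository has not been granted). The following is a minimal sketch, not part of the Evolla repository: `check_ckpts` is a hypothetical helper name, and it only checks that the directories exist, not that the weight files inside them are complete.

```shell
# Hypothetical helper: report which checkpoint directories from step 2 exist.
# The directory names are taken from the git clone commands above.
check_ckpts() {
    local root="${1:-ckpt/huggingface}"
    local missing=0
    local name
    for name in Evolla-10B Evolla-10B-DPO SaProt_650M_AF2 Meta-Llama-3-8B-Instruct; do
        if [ -d "$root/$name" ]; then
            echo "ok       $name"
        else
            echo "missing  $name"
            missing=$((missing + 1))
        fi
    done
    # Exit status is 0 only when every directory is present.
    [ "$missing" -eq 0 ]
}
```

Run from the repository root, `check_ckpts` with no argument inspects ckpt/huggingface; a non-zero exit status means at least one directory is absent.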

3. Run Inference: You can run the model using the provided inference script. Prepare your inputs in a TSV file (e.g., examples/inputs.tsv) and run:

python scripts/inference.py \
    --config_path config/Evolla_10B.yaml \
    --input_path examples/inputs.tsv
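
Because the input is a TSV file, a stray space where a tab should be is an easy mistake to make. The sketch below is a generic pre-flight check and makes no assumption about which columns the inference script expects (see examples/inputs.tsv in the repository for the actual layout). `check_tsv` is a hypothetical helper name; it only verifies that the file exists, is non-empty, and that every row has the same number of tab-separated fields as the first row.

```shell
# Hypothetical helper: verify that a TSV file has a consistent column count.
check_tsv() {
    local path="$1"
    # -s is true only for an existing file with non-zero size.
    if [ ! -s "$path" ]; then
        echo "missing or empty: $path" >&2
        return 1
    fi
    # Compare every row's field count against the first row's.
    awk -F'\t' '
        BEGIN { bad = 0 }
        NR == 1 { cols = NF }
        NF != cols {
            printf "line %d: %d fields, expected %d\n", NR, NF, cols
            bad = 1
        }
        END { exit bad }
    ' "$path"
}
```

Calling `check_tsv examples/inputs.tsv` before the inference command prints each offending line number and returns a non-zero status if any row disagrees with the first.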

For more detailed instructions, please refer to the Evolla GitHub Repository.

Citation

If you find Evolla useful in your research, please cite our paper:

@article{zhou2025decoding,
  title={Decoding the molecular language of proteins with {Evolla}},
  author={Zhou, Xibin and Han, Chenchen and Zhang, Yingqi and Du, Huan and Tian, Jiayuan and Su, Jin and Liu, Renju and Zhuang, Kai and Jiang, Shiyu and Gitter, Anthony and others},
  journal={bioRxiv},
  pages={2025--01},
  year={2025},
  publisher={Cold Spring Harbor Laboratory}
}