EDT-Former: Full Model (Stage 2)

The full EDT-Former model (encoder + Llama-3.1-8B-Instruct), as described in the ICLR 2026 paper:

Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding
Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu
ICLR 2026

Model Description

EDT-Former aligns molecular graphs with a frozen LLM backbone (Llama-3.1-8B-Instruct) via an entropy-guided dynamic token connector. Key properties:

  • No LLM backbone fine-tuning: only the embedding layer and connector are trained, which keeps training computationally efficient
  • Entropy-guided dynamic token selection preserves both local (substructural) and global molecular features (see the sketch after this list)
  • State-of-the-art results on MoleculeQA, Mol-Instructions (forward reaction prediction, retrosynthesis, reagent prediction, molecule design, open-ended QA), TDC, and MoleculeNet benchmarks
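
As a rough illustration of the selection idea, the minimal PyTorch sketch below scores graph-encoder node embeddings by Shannon entropy and keeps the top-k as local tokens, plus one mean-pooled global token. The function name, the entropy scoring rule, and the shapes are assumptions made for illustration; the actual EDT-Former connector is defined in the repository.

import torch
import torch.nn.functional as F

def select_dynamic_tokens(node_emb: torch.Tensor, k: int = 8) -> torch.Tensor:
    """Illustrative sketch only, not the paper's exact connector.

    node_emb: (num_nodes, d) graph-encoder outputs for one molecule.
    Returns (k + 1, d) tokens to feed the LLM connector.
    """
    # Treat each node's softmax-normalized features as a distribution
    # and score nodes by Shannon entropy.
    p = F.softmax(node_emb, dim=-1)                       # (N, d)
    entropy = -(p * (p + 1e-12).log()).sum(dim=-1)        # (N,)
    k = min(k, node_emb.size(0))
    idx = entropy.topk(k).indices                         # top-k local tokens
    local_tokens = node_emb[idx]                          # (k, d) substructure view
    global_token = node_emb.mean(dim=0, keepdim=True)     # (1, d) whole-graph view
    return torch.cat([local_tokens, global_token], dim=0)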

This Stage 2 checkpoint (~16 GB) is the final instruction-tuned model ready for downstream molecular QA tasks.

Usage

# 1. Clone the repo and set up the environment
git clone https://github.com/selmiss/DQ-Former.git
cd DQ-Former
conda env create -f environment.yml
conda activate edtformer

# 2. Configure paths in local.env.sh
cp env.sh local.env.sh
# Edit local.env.sh: set BASE_DIR, DATA_DIR, CHECKPOINT_DIR
source local.env.sh

# 3. Download the Stage 2 checkpoint (~16 GB)
python -c "from huggingface_hub import snapshot_download; snapshot_download('zihaojing/EDT-Former-model', local_dir='checkpoints/edt_former_s2_large/final_model')"

# 4. Run inference (example: forward reaction prediction)
bash scripts/qa/mol_forward.sh
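
If you prefer to run the download step from a Python session rather than as a shell one-liner, the equivalent call is:

from huggingface_hub import snapshot_download

# Download the Stage 2 checkpoint into the path the QA scripts expect.
snapshot_download(
    "zihaojing/EDT-Former-model",
    local_dir="checkpoints/edt_former_s2_large/final_model",
)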

Downstream Task Scripts

All evaluation scripts are in scripts/qa/. Example tasks:

Task                         Script
Forward Reaction Prediction  scripts/qa/mol_forward.sh
Retrosynthesis               scripts/qa/retrosynthesis.sh
Reagent Prediction           scripts/qa/reagent_prediction.sh
Molecule Design              scripts/qa/mol_design.sh
Open-ended QA                scripts/qa/open_question.sh
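
To run several evaluations in sequence, a small driver like the sketch below works, assuming each script takes no arguments and reads its configuration from local.env.sh as in the Usage section. This helper is hypothetical and not part of the repository:

import subprocess

# Hypothetical driver: run each evaluation script in order, stopping on failure.
SCRIPTS = [
    "scripts/qa/mol_forward.sh",
    "scripts/qa/retrosynthesis.sh",
    "scripts/qa/reagent_prediction.sh",
    "scripts/qa/mol_design.sh",
    "scripts/qa/open_question.sh",
]

for script in SCRIPTS:
    print(f"=== Running {script} ===")
    subprocess.run(["bash", script], check=True)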

Training Details

Setting          Value
LLM backbone     Llama-3.1-8B-Instruct (frozen)
Stage 1 encoder  zihaojing/EDT-Former-encoder
Training data    zihaojing/EDT-Former-sft-data
Epochs           2
Learning rate    1e-4 with a cosine schedule
Batch size       4 per device × 8 gradient-accumulation steps = 32 effective
Precision        BF16
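
The batch-size row is the usual per-device batch times gradient-accumulation product. As an illustration only (this is not the repository's actual launch configuration), the table maps onto Hugging Face TrainingArguments as:

from transformers import TrainingArguments

# Illustrative mapping of the settings above; the real entry point is in the repo.
args = TrainingArguments(
    output_dir="checkpoints/edt_former_s2_large",
    num_train_epochs=2,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # effective batch size: 4 x 8 = 32
    bf16=True,
)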

Related Resources

  • Stage 1 encoder: zihaojing/EDT-Former-encoder
  • Instruction-tuning data: zihaojing/EDT-Former-sft-data
  • Code: https://github.com/selmiss/DQ-Former

Citation

@inproceedings{jing2026edtformer,
  title={Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding},
  author={Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Sun, Yan and Wang, Boyu and Hu, Pingzhao},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}