---
license: mit
language:
- en
tags:
- molecules
- chemistry
- molecular-understanding
- graph-llm
- llama
- instruction-tuning
pipeline_tag: text-generation
base_model: unsloth/Llama-3.1-8B-Instruct
---

# EDT-Former: Full Model (Stage 2)
|
|
The full **EDT-Former** model (encoder + Llama-3.1-8B-Instruct), as described in the ICLR 2026 paper:
|
|
> **Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding**
> Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu
> *ICLR 2026* · [Paper](https://www.arxiv.org/abs/2602.02742) · [Code](https://github.com/selmiss/DQ-Former)
|
|
## Model Description
|
|
EDT-Former aligns molecular graphs with a frozen LLM backbone (Llama-3.1-8B-Instruct) via an entropy-guided dynamic token connector. Key properties:
|
|
- **No LLM backbone fine-tuning**: only the embedding layer and the connector are trained, which keeps training computationally efficient
- **Entropy-guided dynamic token selection** preserves both local (substructural) and global molecular features
- **State-of-the-art** results on MoleculeQA, Mol-Instructions (forward reaction prediction, retrosynthesis, reagent prediction, molecule design, open QA), TDC, and MoleculeNet benchmarks
|
|
This Stage 2 checkpoint (~16 GB) is the final instruction-tuned model, ready for downstream molecular QA tasks.
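
The selection mechanism itself lives in the released code; purely as an illustration of the idea, here is a minimal, hypothetical PyTorch sketch of entropy-guided dynamic token selection. All names (`entropy_guided_tokens`, the `threshold` rule, the mean-pooled global token) are illustrative assumptions and do not correspond to the actual implementation.

```python
# Illustrative sketch only -- NOT the released EDT-Former code.
# Shows one way an entropy-guided selector could keep a *dynamic*
# number of query tokens per molecule instead of a fixed-length set.
import torch

def entropy_guided_tokens(node_embs, query_tokens, threshold=2.0):
    """node_embs: (N, d) graph-encoder node features for one molecule.
    query_tokens: (Q, d) learnable queries cross-attending to the nodes.
    Returns (K + 1, d): K selected local tokens plus one global token."""
    d = node_embs.size(-1)
    # Cross-attention of each query over the molecule's nodes.
    scores = query_tokens @ node_embs.T / d**0.5          # (Q, N)
    attn = scores.softmax(dim=-1)                          # (Q, N)
    values = attn @ node_embs                               # (Q, d)

    # Shannon entropy of each query's attention distribution:
    # low entropy -> the query locks onto a small substructure.
    ent = -(attn * attn.clamp_min(1e-9).log()).sum(dim=-1)  # (Q,)

    keep = ent < threshold                  # dynamic count per molecule
    local_tokens = values[keep]             # substructure-level tokens
    global_token = node_embs.mean(dim=0, keepdim=True)  # whole-graph summary
    return torch.cat([local_tokens, global_token], dim=0)
```

In the full model, tokens of this kind would then be projected into the LLM's embedding space and interleaved with the text prompt, with the Llama backbone kept frozen.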
|
|
## Usage
|
|
```bash
# 1. Clone the repo and set up the environment
git clone https://github.com/selmiss/DQ-Former.git
cd DQ-Former
conda env create -f environment.yml
conda activate edtformer

# 2. Configure paths in local.env.sh
cp env.sh local.env.sh
# Edit local.env.sh: set BASE_DIR, DATA_DIR, CHECKPOINT_DIR
source local.env.sh
```

```python
# 3. Download the Stage 2 checkpoint from the Hub
from huggingface_hub import snapshot_download

snapshot_download(
    "zihaojing/EDT-Former-model",
    local_dir="checkpoints/edt_former_s2_large/final_model",
)
```

```bash
# 4. Run inference (example: forward reaction prediction)
bash scripts/qa/mol_forward.sh
```
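
The datasets on the Hub can be fetched the same way. A minimal sketch, assuming you place them under a local `data/` directory (the exact layout the scripts expect is whatever you configure via `DATA_DIR` in `local.env.sh`):

```python
from huggingface_hub import snapshot_download

# Hypothetical target directories -- align them with DATA_DIR in local.env.sh.
snapshot_download("zihaojing/EDT-Former-sft-data", repo_type="dataset",
                  local_dir="data/edt_former_sft")
snapshot_download("zihaojing/EDT-Former-pretrain-data", repo_type="dataset",
                  local_dir="data/edt_former_pretrain")
```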
|
|
## Downstream Task Scripts
|
|
All evaluation scripts are in `scripts/qa/`. Example tasks:
|
|
| Task | Script |
|------|--------|
| Forward Reaction Prediction | `scripts/qa/mol_forward.sh` |
| Retrosynthesis | `scripts/qa/retrosynthesis.sh` |
| Reagent Prediction | `scripts/qa/reagent_prediction.sh` |
| Molecule Design | `scripts/qa/mol_design.sh` |
| Open-ended QA | `scripts/qa/open_question.sh` |
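
To run several evaluations back-to-back, a simple loop works; this is just a convenience sketch, with script names taken from the table above (each script writes its own outputs):

```bash
for task in mol_forward retrosynthesis reagent_prediction mol_design open_question; do
    bash "scripts/qa/${task}.sh"
done
```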
|
|
## Training Details
|
|
| Setting | Value |
|---------|-------|
| LLM backbone | Llama-3.1-8B-Instruct (frozen) |
| Stage 1 encoder | [zihaojing/EDT-Former-encoder](https://huggingface.co/zihaojing/EDT-Former-encoder) |
| Training data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
| Epochs | 2 |
| Learning rate | 1e-4 (cosine schedule) |
| Batch size | 4, with 8 gradient accumulation steps (effective 32) |
| Precision | BF16 |
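
For reference, these hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows. This is only a sketch of the settings in the table, assuming a `transformers`-style trainer; the actual training entry point and any extra options live in the repo.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="checkpoints/edt_former_s2_large",  # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=4,     # assumed per-device; 4 x 8 accumulation = effective 32
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    bf16=True,
)
```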
|
|
## Related Resources
|
|
| Resource | Link |
|----------|------|
| Pretrain Data | [zihaojing/EDT-Former-pretrain-data](https://huggingface.co/datasets/zihaojing/EDT-Former-pretrain-data) |
| SFT Data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
| Encoder (Stage 1) | [zihaojing/EDT-Former-encoder](https://huggingface.co/zihaojing/EDT-Former-encoder) |
| Code | [selmiss/DQ-Former](https://github.com/selmiss/DQ-Former) |
|
|
## Citation
|
|
```bibtex
@inproceedings{jing2026edtformer,
  title={Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding},
  author={Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Sun, Yan and Wang, Boyu and Hu, Pingzhao},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
```
|
|