---
license: mit
language:
- en
tags:
- molecules
- chemistry
- molecular-understanding
- graph-llm
- llama
- instruction-tuning
pipeline_tag: text-generation
base_model: unsloth/Llama-3.1-8B-Instruct
---

# EDT-Former: Full Model (Stage 2)

This repository hosts the full **EDT-Former** model (DQ-Former encoder + Llama-3.1-8B-Instruct), as described in the ICLR 2026 paper:

> **Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding**
> Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu
> *ICLR 2026* · [Paper](https://www.arxiv.org/abs/2602.02742) · [Code](https://github.com/selmiss/DQ-Former)

## Model Description

EDT-Former aligns molecular graphs with a frozen LLM backbone (Llama-3.1-8B-Instruct) via a DQ-Former connector. Key properties:

- **No LLM backbone fine-tuning**: only the embedding layer and the connector are trained, which keeps training computationally efficient
- **Entropy-guided dynamic token selection** preserves both local (substructural) and global molecular features (see the sketch below)
- **State-of-the-art** results on MoleculeQA, Mol-Instructions (forward reaction prediction, retrosynthesis, reagent prediction, molecule design, open-ended QA), TDC, and MoleculeNet benchmarks

This Stage 2 checkpoint (~16 GB) is the final instruction-tuned model, ready for downstream molecular QA tasks.
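
The exact selection rule is defined in the paper and code; the sketch below only illustrates the general idea. Everything in it (tensor names, shapes, the fixed local/global split) is an assumption for illustration, not the released implementation.

```python
import torch

def select_dynamic_tokens(attn, queries, k_local=8, k_global=8):
    """Illustrative entropy-guided selection (hypothetical, NOT the
    released implementation).

    attn:    (num_queries, num_nodes) softmax-normalized cross-attention
             from connector query tokens to graph nodes
    queries: (num_queries, d) query-token embeddings
    """
    # Shannon entropy of each query's attention distribution over nodes.
    probs = attn.clamp_min(1e-9)
    entropy = -(probs * probs.log()).sum(dim=-1)  # (num_queries,)
    # Low-entropy queries focus on a few nodes (local substructure);
    # high-entropy queries spread over the whole graph (global context).
    local_idx = entropy.topk(k_local, largest=False).indices
    global_idx = entropy.topk(k_global, largest=True).indices
    keep = torch.cat([local_idx, global_idx]).unique()
    return queries[keep]  # dynamic token set passed to the frozen LLM
```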

## Usage

```bash
# 1. Clone the repo and set up the environment
git clone https://github.com/selmiss/DQ-Former.git
cd DQ-Former
conda env create -f environment.yml
conda activate edtformer

# 2. Configure paths in local.env.sh
cp env.sh local.env.sh
# Edit local.env.sh: set BASE_DIR, DATA_DIR, CHECKPOINT_DIR
source local.env.sh
```

```python
# 3. Download the Stage 2 checkpoint from the Hub
from huggingface_hub import snapshot_download

snapshot_download(
    "zihaojing/DQFormer-model",
    local_dir="checkpoints/edt_former_s2_large/final_model",
)
```

```bash
# 4. Run inference (example: forward reaction prediction)
bash scripts/qa/mol_forward.sh
```
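
Because the checkpoint is about 16 GB, it can be convenient to pull only the small metadata files first. A minimal sketch using `snapshot_download`'s `allow_patterns` filter; the patterns are an assumption about the repo layout, so check the file list on the Hub:

```python
from huggingface_hub import snapshot_download

# Fetch only lightweight files first (patterns are illustrative).
snapshot_download(
    "zihaojing/DQFormer-model",
    local_dir="checkpoints/edt_former_s2_large/final_model",
    allow_patterns=["*.json", "*.md"],
)
```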

## Downstream Task Scripts

All evaluation scripts are in `scripts/qa/`. Example tasks:

| Task | Script |
|------|--------|
| Forward Reaction Prediction | `scripts/qa/mol_forward.sh` |
| Retrosynthesis | `scripts/qa/retrosynthesis.sh` |
| Reagent Prediction | `scripts/qa/reagent_prediction.sh` |
| Molecule Design | `scripts/qa/mol_design.sh` |
| Open-ended QA | `scripts/qa/open_question.sh` |
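
Any of these can also be driven from Python; a convenience sketch (not part of the repo) that sources the environment file before running a task:

```python
# Equivalent to: source local.env.sh && bash scripts/qa/retrosynthesis.sh
import subprocess

subprocess.run(
    ["bash", "-c", "source local.env.sh && bash scripts/qa/retrosynthesis.sh"],
    check=True,
)
```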

## Training Details

| Setting | Value |
|---------|-------|
| LLM backbone | Llama-3.1-8B-Instruct (frozen) |
| Stage 1 encoder | [zihaojing/DQFormer-encoder](https://huggingface.co/zihaojing/DQFormer-encoder) |
| Training data | [zihaojing/DQFormer-sft-data](https://huggingface.co/datasets/zihaojing/DQFormer-sft-data) |
| Epochs | 2 |
| Learning rate | 1e-4 (cosine schedule) |
| Batch size | 4 × 8 gradient accumulation = effective 32 |
| Precision | BF16 |
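
As a rough mapping of these hyperparameters onto Hugging Face `transformers` (a sketch only; the repo's actual training entry point and trainer may differ):

```python
from transformers import TrainingArguments

# Stage 2 hyperparameters from the table above.
args = TrainingArguments(
    output_dir="checkpoints/edt_former_s2_large",  # illustrative path
    num_train_epochs=2,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # 4 x 8 = effective batch size 32
    bf16=True,
)
```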

## Related Resources

| Resource | Link |
|----------|------|
| Pretrain Data | [zihaojing/DQFormer-pretrain-data](https://huggingface.co/datasets/zihaojing/DQFormer-pretrain-data) |
| SFT Data | [zihaojing/DQFormer-sft-data](https://huggingface.co/datasets/zihaojing/DQFormer-sft-data) |
| Encoder (Stage 1) | [zihaojing/DQFormer-encoder](https://huggingface.co/zihaojing/DQFormer-encoder) |
| Code | [selmiss/DQ-Former](https://github.com/selmiss/DQ-Former) |

## Citation

```bibtex
@inproceedings{jing2026edtformer,
  title={Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding},
  author={Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Sun, Yan and Wang, Boyu and Hu, Pingzhao},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
```