---
license: mit
language:
- en
tags:
- molecules
- chemistry
- graph-encoder
- qformer
- molecular-understanding
pipeline_tag: feature-extraction
---

# EDT-Former Encoder (Stage 1)

This repository provides the pretrained **EDT-Former encoder** from the ICLR 2026 paper:

> **Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding**
> Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu
> *ICLR 2026* · [Paper](https://www.arxiv.org/abs/2602.02742) · [Code](https://github.com/selmiss/DQ-Former)

## Model Description

The EDT-Former encoder is a Dual Q-Former that bridges molecular graphs and language. It uses:
- **Entropy-guided dynamic token selection** to focus on informative molecular patches (see the sketch after this list)
- **BRICS fragment IDs** for substructural awareness
- **Cross-attention over graph node features** to generate a token sequence aligned with text
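
The dynamic-token step can be pictured as scoring each graph patch by the entropy of its attention distribution and keeping only the most focused ones. The snippet below is a minimal, hypothetical sketch of that idea; the exact scoring and selection rule used in the paper may differ.

```python
import torch

def select_informative_patches(attn_weights: torch.Tensor, k: int) -> torch.Tensor:
    """Hypothetical entropy-based patch selection (illustration only).

    attn_weights: (num_patches, num_keys) row-stochastic attention per patch.
    Returns the indices of the k patches with the lowest attention entropy,
    i.e. the patches whose attention is most sharply focused.
    """
    entropy = -(attn_weights * attn_weights.clamp_min(1e-9).log()).sum(dim=-1)
    return torch.topk(entropy, k, largest=False).indices

# Toy example: 64 patches, each attending over 128 keys.
attn = torch.softmax(torch.randn(64, 128), dim=-1)
print(select_informative_patches(attn, k=16))
```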

This Stage 1 checkpoint (~699 MB) is trained on the PubChem pretraining corpus and is used to initialize Stage 2 (full model) training.

**Architecture config:**
- `num_query_tokens`: 32
- `embed_dim`: 512
- `cross_attention_freq`: 1
- `num_layers`: 8 (blending module)
- `num_heads`: 8
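
As a rough illustration of how these settings fit together (this is a sketch, not the released implementation), the following shows 32 learned query tokens cross-attending over graph node features to produce a fixed-length molecular token sequence:

```python
import torch
import torch.nn as nn

embed_dim, num_query_tokens, num_heads = 512, 32, 8

# Learned query tokens, shared across molecules.
query_tokens = nn.Parameter(torch.zeros(1, num_query_tokens, embed_dim))
cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

# Dummy graph node features: (batch, num_nodes, embed_dim).
node_feats = torch.randn(4, 100, embed_dim)
queries = query_tokens.expand(node_feats.size(0), -1, -1)

# Each query token attends over all node features, yielding a fixed-length
# token sequence that can be aligned with text.
mol_tokens, _ = cross_attn(queries, node_feats, node_feats)
print(mol_tokens.shape)  # torch.Size([4, 32, 512])
```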

## Usage

Use this checkpoint as the Stage 1 initialization for Stage 2 fine-tuning:

```yaml
# configs/stage2_dqw2d/model_config.yaml
stage1_path: path/to/EDT-Former-encoder/model.safetensors
```

Or download and use directly:

```python
from huggingface_hub import snapshot_download

snapshot_download("zihaojing/EDT-Former-encoder", local_dir="checkpoints/edt_former_s1_large/final_model")
```
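
Once downloaded, the Stage 1 weights can be inspected as a plain state dict with `safetensors` (a sketch; the tensor names depend on the training code in the repo):

```python
from safetensors.torch import load_file

state_dict = load_file("checkpoints/edt_former_s1_large/final_model/model.safetensors")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```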

To reproduce Stage 1 training from scratch:

```bash
# Set up environment first (see repo README)
bash scripts/training/pretraining.sh
```

## Related Resources

| Resource | Link |
|----------|------|
| Pretrain Data | [zihaojing/EDT-Former-pretrain-data](https://huggingface.co/datasets/zihaojing/EDT-Former-pretrain-data) |
| SFT Data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
| Full Model (Stage 2) | [zihaojing/EDT-Former-model](https://huggingface.co/zihaojing/EDT-Former-model) |
| Code | [selmiss/DQ-Former](https://github.com/selmiss/DQ-Former) |

## Citation

```bibtex
@inproceedings{jing2026edtformer,
  title={Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding},
  author={Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Sun, Yan and Wang, Boyu and Hu, Pingzhao},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
```