Commit 67630f7 (verified) by zihaojing · 1 parent: 683a296

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -15,7 +15,7 @@ base_model: unsloth/Llama-3.1-8B-Instruct

  # EDT-Former: Full Model (Stage 2)

- The full **EDT-Former** model (DQ-Former encoder + Llama-3.1-8B-Instruct), as described in the ICLR 2026 paper:
+ The full **EDT-Former** model (encoder + Llama-3.1-8B-Instruct), as described in the ICLR 2026 paper:

  > **Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding**
  > Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu
@@ -23,7 +23,7 @@ The full **EDT-Former** model (DQ-Former encoder + Llama-3.1-8B-Instruct), as de

  ## Model Description

- EDT-Former aligns molecular graphs with a frozen LLM backbone (Llama-3.1-8B-Instruct) via a DQ-Former connector. Key properties:
+ EDT-Former aligns molecular graphs with a frozen LLM backbone (Llama-3.1-8B-Instruct) via an entropy-guided dynamic token connector. Key properties:

  - **No LLM backbone fine-tuning** (only the embedding layer and connector are trained) — computationally efficient
  - **Entropy-guided dynamic token selection** preserves both local (substructural) and global molecular features
@@ -45,9 +45,9 @@ cp env.sh local.env.sh
  # Edit local.env.sh: set BASE_DIR, DATA_DIR, CHECKPOINT_DIR
  source local.env.sh

- # 3. Download model
+ # 3. Download the model
  from huggingface_hub import snapshot_download
- snapshot_download("zihaojing/DQFormer-model", local_dir="checkpoints/edt_former_s2_large/final_model")
+ snapshot_download("zihaojing/EDT-Former-model", local_dir="checkpoints/edt_former_s2_large/final_model")

  # 4. Run inference (example: forward reaction prediction)
  bash scripts/qa/mol_forward.sh
@@ -70,8 +70,8 @@ All evaluation scripts are in `scripts/qa/`. Example tasks:
  | Setting | Value |
  |---------|-------|
  | LLM backbone | Llama-3.1-8B-Instruct (frozen) |
- | Stage 1 encoder | [zihaojing/DQFormer-encoder](https://huggingface.co/zihaojing/DQFormer-encoder) |
- | Training data | [zihaojing/DQFormer-sft-data](https://huggingface.co/datasets/zihaojing/DQFormer-sft-data) |
+ | Stage 1 encoder | [zihaojing/EDT-Former-encoder](https://huggingface.co/zihaojing/EDT-Former-encoder) |
+ | Training data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
  | Epochs | 2 |
  | Learning rate | 1e-4 (cosine) |
  | Batch size | 4 × 8 grad accum = effective 32 |
@@ -81,9 +81,9 @@ All evaluation scripts are in `scripts/qa/`. Example tasks:

  | Resource | Link |
  |----------|------|
- | Pretrain Data | [zihaojing/DQFormer-pretrain-data](https://huggingface.co/datasets/zihaojing/DQFormer-pretrain-data) |
- | SFT Data | [zihaojing/DQFormer-sft-data](https://huggingface.co/datasets/zihaojing/DQFormer-sft-data) |
- | Encoder (Stage 1) | [zihaojing/DQFormer-encoder](https://huggingface.co/zihaojing/DQFormer-encoder) |
+ | Pretrain Data | [zihaojing/EDT-Former-pretrain-data](https://huggingface.co/datasets/zihaojing/EDT-Former-pretrain-data) |
+ | SFT Data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
+ | Encoder (Stage 1) | [zihaojing/EDT-Former-encoder](https://huggingface.co/zihaojing/EDT-Former-encoder) |
  | Code | [selmiss/DQ-Former](https://github.com/selmiss/DQ-Former) |

  ## Citation