---
license: mit
language:
- en
tags:
- molecules
- chemistry
- molecular-understanding
- graph-llm
- llama
- instruction-tuning
pipeline_tag: text-generation
base_model: unsloth/Llama-3.1-8B-Instruct
---

# EDT-Former: Full Model (Stage 2)
|
|
The full **EDT-Former** model (encoder + Llama-3.1-8B-Instruct), as described in the ICLR 2026 paper:
|
|
> **Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding**
> Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu
> *ICLR 2026* · [Paper](https://www.arxiv.org/abs/2602.02742) · [Code](https://github.com/selmiss/DQ-Former)
|
|
## Model Description
|
|
EDT-Former aligns molecular graphs with a frozen LLM backbone (Llama-3.1-8B-Instruct) via an entropy-guided dynamic token connector. Key properties:
|
|
- **No LLM backbone fine-tuning**: only the embedding layer and the connector are trained, which keeps training computationally efficient
- **Entropy-guided dynamic token selection** preserves both local (substructural) and global molecular features
- **State-of-the-art** results on MoleculeQA, Mol-Instructions (forward reaction prediction, retrosynthesis, reagent prediction, molecule design, open QA), TDC, and MoleculeNet benchmarks
|
|
This Stage 2 checkpoint (~16 GB) is the final instruction-tuned model, ready for downstream molecular QA tasks.
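
The selection mechanism itself lives in the released code; purely as an illustration of the idea, here is a minimal, hypothetical PyTorch sketch of entropy-guided dynamic token selection. All names (`entropy_guided_tokens`, the `threshold` rule, the mean-pooled global token) are illustrative assumptions and do not correspond to the actual implementation.

```python
# Illustrative sketch only -- NOT the released EDT-Former code.
# Shows one way an entropy-guided selector could keep a *dynamic*
# number of query tokens per molecule instead of a fixed-length set.
import torch

def entropy_guided_tokens(node_embs, query_tokens, threshold=2.0):
    """node_embs: (N, d) graph-encoder node features for one molecule.
    query_tokens: (Q, d) learnable queries cross-attending to the nodes.
    Returns (K + 1, d): K selected local tokens plus one global token."""
    d = node_embs.size(-1)
    # Cross-attention of each query over the molecule's nodes.
    scores = query_tokens @ node_embs.T / d**0.5          # (Q, N)
    attn = scores.softmax(dim=-1)                          # (Q, N)
    values = attn @ node_embs                               # (Q, d)

    # Shannon entropy of each query's attention distribution:
    # low entropy -> the query locks onto a small substructure.
    ent = -(attn * attn.clamp_min(1e-9).log()).sum(dim=-1)  # (Q,)

    keep = ent < threshold                  # dynamic count per molecule
    local_tokens = values[keep]             # substructure-level tokens
    global_token = node_embs.mean(dim=0, keepdim=True)  # whole-graph summary
    return torch.cat([local_tokens, global_token], dim=0)
```

In the full model, tokens of this kind would then be projected into the LLM's embedding space and interleaved with the text prompt, with the Llama backbone kept frozen.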
|
|
## Usage
|
|
```bash
# 1. Clone the repo and set up the environment
git clone https://github.com/selmiss/DQ-Former.git
cd DQ-Former
conda env create -f environment.yml
conda activate edtformer

# 2. Configure paths in local.env.sh
cp env.sh local.env.sh
# Edit local.env.sh: set BASE_DIR, DATA_DIR, CHECKPOINT_DIR
source local.env.sh
```

```python
# 3. Download the Stage 2 checkpoint from the Hub
from huggingface_hub import snapshot_download

snapshot_download(
    "zihaojing/EDT-Former-model",
    local_dir="checkpoints/edt_former_s2_large/final_model",
)
```

```bash
# 4. Run inference (example: forward reaction prediction)
bash scripts/qa/mol_forward.sh
```
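
The datasets on the Hub can be fetched the same way. A minimal sketch, assuming you place them under a local `data/` directory (the exact layout the scripts expect is whatever you configure via `DATA_DIR` in `local.env.sh`):

```python
from huggingface_hub import snapshot_download

# Hypothetical target directories -- align them with DATA_DIR in local.env.sh.
snapshot_download("zihaojing/EDT-Former-sft-data", repo_type="dataset",
                  local_dir="data/edt_former_sft")
snapshot_download("zihaojing/EDT-Former-pretrain-data", repo_type="dataset",
                  local_dir="data/edt_former_pretrain")
```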
|
|
## Downstream Task Scripts
|
|
All evaluation scripts are in `scripts/qa/`. Example tasks:
|
|
| Task | Script |
|------|--------|
| Forward Reaction Prediction | `scripts/qa/mol_forward.sh` |
| Retrosynthesis | `scripts/qa/retrosynthesis.sh` |
| Reagent Prediction | `scripts/qa/reagent_prediction.sh` |
| Molecule Design | `scripts/qa/mol_design.sh` |
| Open-ended QA | `scripts/qa/open_question.sh` |
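
To run several evaluations back-to-back, a simple loop works; this is just a convenience sketch, with script names taken from the table above (each script writes its own outputs):

```bash
for task in mol_forward retrosynthesis reagent_prediction mol_design open_question; do
    bash "scripts/qa/${task}.sh"
done
```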
|
|
## Training Details
|
|
| Setting | Value |
|---------|-------|
| LLM backbone | Llama-3.1-8B-Instruct (frozen) |
| Stage 1 encoder | [zihaojing/EDT-Former-encoder](https://huggingface.co/zihaojing/EDT-Former-encoder) |
| Training data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
| Epochs | 2 |
| Learning rate | 1e-4 (cosine schedule) |
| Batch size | 4, with 8 gradient accumulation steps (effective 32) |
| Precision | BF16 |
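
For reference, these hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows. This is only a sketch of the settings in the table, assuming a `transformers`-style trainer; the actual training entry point and any extra options live in the repo.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="checkpoints/edt_former_s2_large",  # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=4,     # assumed per-device; 4 x 8 accumulation = effective 32
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    bf16=True,
)
```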
|
|
## Related Resources
|
|
| Resource | Link |
|----------|------|
| Pretrain Data | [zihaojing/EDT-Former-pretrain-data](https://huggingface.co/datasets/zihaojing/EDT-Former-pretrain-data) |
| SFT Data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
| Encoder (Stage 1) | [zihaojing/EDT-Former-encoder](https://huggingface.co/zihaojing/EDT-Former-encoder) |
| Code | [selmiss/DQ-Former](https://github.com/selmiss/DQ-Former) |
|
|
## Citation
|
|
```bibtex
@inproceedings{jing2026edtformer,
  title={Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding},
  author={Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Sun, Yan and Wang, Boyu and Hu, Pingzhao},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
```
|
|