# A Prompt-based Knowledge Graph Foundation Model for Universal In-Context Reasoning

<div align="center">
<img src="Logo.jpg" width="200">
</div>

> Extensive knowledge graphs (KGs) have been constructed to facilitate knowledge-driven tasks across various scenarios. However, existing work usually develops separate reasoning models for different KGs, lacking the ability to generalize and transfer knowledge across diverse KGs and reasoning settings. In this paper, we propose a prompt-based KG foundation model via in-context learning, namely **KG-ICL**, to achieve a universal reasoning ability. Specifically, we introduce a prompt graph centered with a query-related example fact as context to understand the query relation. To encode prompt graphs with the generalization ability to unseen entities and relations in queries, we first propose a unified tokenizer that maps entities and relations in prompt graphs to predefined tokens. Then, we propose two message passing neural networks to perform prompt encoding and KG reasoning, respectively. We conduct evaluation on 43 different KGs in both transductive and inductive settings. Results indicate that the proposed KG-ICL outperforms baselines on most datasets, showcasing its outstanding generalization and universal reasoning capabilities.
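
As a concrete illustration of the prompt-graph idea described above, here is a minimal sketch (toy data structures and a hypothetical `build_prompt_graph` helper, not the released implementation): pick an example fact for the query relation and gather the facts within a small neighborhood of its two entities as context.

```python
from collections import defaultdict

def build_prompt_graph(triples, query_relation, num_hops=1):
    """Toy sketch: pick an example fact for the query relation and
    collect all facts touching its entities within num_hops hops."""
    # Pick the first fact whose relation matches the query relation.
    example = next(t for t in triples if t[1] == query_relation)
    # Index facts by the entities they touch.
    by_entity = defaultdict(list)
    for h, r, t in triples:
        by_entity[h].append((h, r, t))
        by_entity[t].append((h, r, t))
    # Breadth-first expansion around the example fact's two entities.
    frontier = {example[0], example[2]}
    prompt_facts = {example}
    for _ in range(num_hops):
        next_frontier = set()
        for e in frontier:
            for h, r, t in by_entity[e]:
                prompt_facts.add((h, r, t))
                next_frontier.update((h, t))
        frontier = next_frontier
    return example, sorted(prompt_facts)

# Tiny hypothetical KG for demonstration.
kg = [
    ("paris", "capital_of", "france"),
    ("lyon", "located_in", "france"),
    ("berlin", "capital_of", "germany"),
    ("france", "member_of", "eu"),
]
example, prompt = build_prompt_graph(kg, "capital_of")
```

With one hop, the prompt graph keeps the facts around `paris` and `france` but excludes the unrelated `berlin` fact.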

![image](overview.png)
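
One way to picture the unified tokenizer is sketched below. The labeling scheme here (entity tokens as capped BFS distances to the example fact's head and tail, relation tokens as query vs. non-query) is an illustrative assumption, not the paper's exact scheme; the point is that arbitrary entity and relation names collapse onto a small fixed token vocabulary, so unseen KGs can be encoded.

```python
from collections import deque

def entity_distances(facts, source):
    """BFS distances over the undirected fact graph."""
    adj = {}
    for h, _, t in facts:
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def tokenize_prompt_graph(facts, example, query_relation, max_dist=3):
    """Illustrative sketch: map names onto predefined tokens."""
    head, _, tail = example
    d_head = entity_distances(facts, head)
    d_tail = entity_distances(facts, tail)
    # Entity token: capped (distance-to-head, distance-to-tail) pair.
    entity_tokens = {}
    for h, _, t in facts:
        for e in (h, t):
            entity_tokens[e] = (min(d_head.get(e, max_dist), max_dist),
                                min(d_tail.get(e, max_dist), max_dist))
    # Relation token: does the relation match the query relation?
    relation_tokens = {r: "QUERY_REL" if r == query_relation else "OTHER_REL"
                       for _, r, _ in facts}
    return entity_tokens, relation_tokens

facts = [("a", "likes", "b"), ("b", "knows", "c")]
ent_tok, rel_tok = tokenize_prompt_graph(facts, ("a", "likes", "b"), "likes")
```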

# Updates

- **2024-10-24**: CPU version: KG-ICL can now run on CPU devices, without requiring a GPU.
- **2024-10-14**: We have released the code and datasets for KG-ICL.
- **2024-09-26**: Our paper has been accepted by NeurIPS 2024!

# Instructions

The instructions below walk readers through reproducing the whole process.
Please use Python 3.9 to run the code.

## Install dependencies

```shell
pip install torch==2.2.0 --index-url https://download.pytorch.org/whl/cu118
pip install torch-scatter==2.1.2 torch-sparse==0.6.18 torch-geometric==2.4.0 -f https://data.pyg.org/whl/torch-2.2.0+cu118.html
pip install ninja easydict pyyaml tqdm
```

If your NumPy version is not compatible, please install the following version:

```shell
pip uninstall numpy
pip install numpy==1.24.0
```

We use the ``rspmm`` kernel. Please make sure your ``CUDA_HOME`` variable is set properly to avoid potential compilation errors, e.g.,

```shell
export CUDA_HOME=/usr/local/cuda-11.8/
```

If you do not have GPUs, or the ``rspmm`` kernel does not compile successfully, please set the hyperparameter ``use_rspmm`` to ``False``.
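
For intuition, the kind of operation a relational sparse-dense kernel performs can be written densely in pure Python. The element-wise message function below is an assumption for illustration; the actual ``rspmm`` kernel may use different message and aggregation functions.

```python
def relational_spmm(edges, relation_vecs, node_vecs, num_nodes, dim):
    """Toy dense equivalent of a relational sparse matrix product:
    for every edge (head, rel, tail), send the element-wise product of
    the head's vector and the relation's vector to the tail node,
    summing messages per tail. (Semantics assumed for illustration.)"""
    out = [[0.0] * dim for _ in range(num_nodes)]
    for h, r, t in edges:
        for k in range(dim):
            out[t][k] += node_vecs[h][k] * relation_vecs[r][k]
    return out

# Two edges of the same relation pointing at node 1.
edges = [(0, 0, 1), (2, 0, 1)]
relation_vecs = [[1.0, 2.0]]
node_vecs = [[1.0, 1.0], [0.0, 0.0], [3.0, 4.0]]
out = relational_spmm(edges, relation_vecs, node_vecs, num_nodes=3, dim=2)
```

A CUDA kernel fuses this loop over the sparse edge list; with ``use_rspmm`` disabled, an equivalent (slower) computation runs on CPU.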


## Dataset

Unzip the released datasets:

```shell
unzip datasets.zip
```

Process the datasets:

```shell
cd datasets
chmod +x process.sh
./process.sh
```
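
The exact on-disk format produced by ``process.sh`` is not documented here; assuming the common ``head<TAB>relation<TAB>tail`` layout used by many KG benchmarks (an assumption, so check the processed files), a loader can be sketched as:

```python
import os
import tempfile

def load_triples(path):
    """Read one tab-separated (head, relation, tail) triple per line.
    The file format is assumed, not taken from the repository."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 3:
                triples.append(tuple(parts))
    return triples

# Demonstrate on a temporary file with two triples.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("paris\tcapital_of\tfrance\nberlin\tcapital_of\tgermany\n")
triples = load_triples(tmp.name)
os.remove(tmp.name)
```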

## Quick Start

> If you have any difficulty or questions running the code or reproducing the experimental results, please email yncui.nju@gmail.com.

For pre-training:

```shell
cd src
python pretrain.py
```

The checkpoints will be stored in the ``./checkpoint/pretrain/`` folder.

For evaluation, please replace the checkpoint path and test dataset path in the following shell script:

```shell
cd shell
chmod +x test.sh
./test.sh
```

If you want to run inference on a specific dataset, please replace the checkpoint path and evaluation dataset path:

```shell
cd src
python evaluation.py --checkpoint_path ./checkpoint/pretrain/kg_icl_6l --test_dataset_list [dataset_name]
```

For fine-tuning, please replace the checkpoint path and fine-tune dataset path in the following shell script:

```shell
cd shell
chmod +x finetune.sh
./finetune.sh
```

## Results

We have released three versions of the model: KG-ICL-4L, KG-ICL-5L, and KG-ICL-6L.
They were trained with 4, 5, and 6 layers of the KG encoder, respectively.
If your device has limited memory, choose the KG-ICL-4L model; if you have enough memory, choose the KG-ICL-6L model.
Here are the results of the three models:

### MRR

| Model | Inductive | Fully-Inductive | Transductive | Average |
| --- | --- | --- | --- | --- |
| Supervised SOTA | 0.466 | 0.210 | 0.365 | 0.351 |
| ULTRA (pretrain) | 0.513 | 0.352 | 0.329 | 0.396 |
| ULTRA (finetune) | 0.528 | 0.350 | 0.384 | 0.421 |
| KG-ICL-4L (pretrain) | 0.550 | 0.434 | 0.328 | 0.433 |
| KG-ICL-5L (pretrain) | 0.554 | 0.438 | 0.346 | 0.441 |
| KG-ICL-6L (pretrain) | 0.550 | 0.442 | 0.350 | 0.443 |
| KG-ICL-6L (finetune) | 0.592 | 0.444 | 0.413 | 0.481 |

### Hits@10

| Model | Inductive | Fully-Inductive | Transductive | Average |
| --- | --- | --- | --- | --- |
| Supervised SOTA | 0.607 | 0.347 | 0.511 | 0.493 |
| ULTRA (pretrain) | 0.664 | 0.536 | 0.479 | 0.557 |
| ULTRA (finetune) | 0.684 | 0.542 | 0.548 | 0.590 |
| KG-ICL-4L (pretrain) | 0.696 | 0.622 | 0.471 | 0.590 |
| KG-ICL-5L (pretrain) | 0.705 | 0.635 | 0.501 | 0.608 |
| KG-ICL-6L (pretrain) | 0.706 | 0.642 | 0.504 | 0.611 |
| KG-ICL-6L (finetune) | 0.738 | 0.640 | 0.566 | 0.644 |
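
For reference, MRR and Hits@10 follow their standard definitions and can be computed from the rank of each query's correct answer:

```python
def mrr(ranks):
    """Mean reciprocal rank: average of 1/rank over all test queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, k=10):
    """Fraction of queries whose correct answer ranks within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the correct answers for four queries.
ranks = [1, 2, 4, 20]
```

For these ranks, MRR is (1 + 1/2 + 1/4 + 1/20) / 4 = 0.45 and Hits@10 is 3/4 = 0.75.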


## Citation

If you find this repository helpful, please cite the following [paper](http://arxiv.org/abs/2410.12288):

```bibtex
@inproceedings{cui2024prompt,
  title     = {A Prompt-based Knowledge Graph Foundation Model for Universal In-Context Reasoning},
  author    = {Cui, Yuanning and Sun, Zequn and Hu, Wei},
  booktitle = {NeurIPS},
  year      = {2024}
}
```