[Survey Notes] A Survey of Graph Meets Large Language Model: Progress and Future Directions

Paper link: A Survey of Graph Meets Large Language Model: Progress and Future Directions | IJCAI

Related/collected papers: GitHub - yhLeeee/Awesome-LLMs-in-Graph-tasks: A curated collection of research papers exploring the utilization of LLMs for graph-related tasks.

The English is typed entirely by hand, summarizing and paraphrasing the original paper. Spelling and grammar mistakes are hard to avoid; if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with care.

Table of Contents

1. Thoughts

2. Section-by-Section Close Reading

2.1. Abstract

2.2. Introduction

2.3. Preliminary

2.3.1. Graph Neural Networks

2.3.2. Large Language Models

2.3.3. Proposed Taxonomy

2.4. LLM as Enhancer

2.4.1. Explanation-based Enhancement

2.4.2. Embedding-based Enhancement

2.4.3. Discussions

2.5. LLM as Predictor

2.5.1. Flatten-based Prediction

2.5.2. GNN-based Prediction

2.5.3. Discussions

2.6. GNN-LLM Alignment

2.6.1. Symmetric

2.6.2. Asymmetric

2.6.3. Discussions

2.7. Future Directions

2.8. Conclusion

3. Supplementary Knowledge

3.1. Frozen and tuning in LLMs

3.2. Superpixel graph

3.3. Agent

4. Reference


1. Thoughts

(1) Oops~ this is my first time reading a conference survey~ such a nice feeling, the length can be really short~~

(2) The whole paper reads like one big mind map

(3) Perhaps due to the page limit, it feels like the paper introduces many, many, many models but all in plain prose... Understandable, of course, it is just that for the reader it is... too many words! Still, each major category comes with a figure and formulas, and those are quite nice to look at

(4) For details of specific models you have to go to the original paper; it covers countless models, which I cannot recount here

2. Section-by-Section Close Reading

2.1. Abstract

        ①The authors propose a taxonomy that treats the LLM in graph tasks as a) an enhancer, b) a predictor, or c) an alignment component

taxonomy  n. the science of classification; a classification scheme

2.2. Introduction

        ①The first paragraph introduces GNNs

        ②The second paragraph reviews LLMs

        ③The third discusses the combination of graphs and LLMs:

        ④The overall taxonomy table:

        ⑤Lists the contributions of related LLM+graph surveys and their limitations

        ⑥Their contributions: a) taxonomy, b) review, c) future directions

granularity  n. the level of detail or scale; (geology) grain size

2.3. Preliminary

2.3.1. Graph Neural Networks

(1)Definitions

        ①The message passing mechanism of GNN:

h_i^{(l)}=\mathbf{U}\left ( h_i^{(l-1)},\mathbf{M}\left ( \left \{ h_i^{(l-1)},h_j^{(l-1)}|v_j \in \mathcal{N}_i \right \} \right ) \right )

where h_i^{(l)} denotes the feature vector of node i at the l-th layer, \mathcal{N}_i is the set of neighbors of node i, \mathbf{M} is the message passing function, and \mathbf{U} is the updating function
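To make the message-passing formula concrete, here is a minimal PyTorch sketch of one layer, where I choose mean aggregation as \mathbf{M} and a linear layer with ReLU as \mathbf{U} (these are illustrative choices, not prescribed by the paper):

```python
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """One GNN layer: h_i^(l) = U(h_i^(l-1), M({h_j^(l-1) : v_j in N_i}))."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        # U: update function, here a linear map over [self ; aggregated neighbors]
        self.update = nn.Linear(2 * in_dim, out_dim)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) binary adjacency matrix
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        msg = adj @ h / deg  # M: mean over neighbor features
        return torch.relu(self.update(torch.cat([h, msg], dim=-1)))  # U

# toy usage: 4 nodes with 8-dim features on a small ring graph
h = torch.randn(4, 8)
adj = torch.tensor([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]], dtype=torch.float)
layer = MessagePassingLayer(8, 16)
print(layer(h, adj).shape)  # torch.Size([4, 16])
```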

(2)Graph pre-training and prompting

        ①Limitations of GNNs: the need for annotations and weak generalization, both partially alleviated by graph pre-training

        ②Graph pre-training methods are categorized into contrastive and generative approaches, with representative models listed for each

pertain  vi. to apply; to exist; to be relevant

2.3.2. Large Language Models

(1)Definitions

        ①Difference between LLMs and pre-trained language models (PLMs): LLMs are huge language models (i.e., with billions of parameters) pre-trained on a massive amount of data; PLMs are earlier pre-trained models of moderate size (i.e., millions of parameters), which can easily be further fine-tuned on task-specific data for better results on downstream tasks

(2)Evolution

        ①Categories of LLMs: non-autoregressive and autoregressive

        ②Non-autoregressive LLMs: focus on natural language understanding and take masked language modeling as the pre-training task

        ③Autoregressive LLMs: take next-token prediction as the pre-training task
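A tiny sketch contrasting the two pre-training objectives, using random toy tensors in place of a real model (everything here is illustrative):

```python
import torch
import torch.nn.functional as F

vocab, seq_len = 1000, 8
logits = torch.randn(seq_len, vocab)          # per-position predictions from some LM
tokens = torch.randint(0, vocab, (seq_len,))  # ground-truth token ids

# Non-autoregressive (masked LM): predict only the masked positions
mask = torch.tensor([0, 1, 0, 0, 1, 0, 0, 0], dtype=torch.bool)
mlm_loss = F.cross_entropy(logits[mask], tokens[mask])

# Autoregressive: each position predicts the *next* token
ar_loss = F.cross_entropy(logits[:-1], tokens[1:])
```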

2.3.3. Proposed Taxonomy

        ①Introduces their Fig. 2 (the second figure in this post)

2.4. LLM as Enhancer

        ①The two methods of using the LLM as an enhancer:

2.4.1. Explanation-based Enhancement

        ①In explanation-based enhancement, the LLM's textual outputs can serve as explanations, knowledge entities, or pseudo labels

        ②Typical pipeline:

e_i=\mathrm{LLM}\left ( t_i,p \right ),\quad \mathbf{x}_i=\mathrm{LM}\left ( t_i,e_i \right ),\quad \mathbf{H}=\mathrm{GNN}\left ( \mathbf{X},\mathbf{A} \right )

where t_i denotes the text attributes, p denotes the designed textual prompt, e_i denotes the additional textual output, \mathbf{x}_i \in \mathbb{R}^{D} is the enhanced embedding of node i and \mathbf{X} \in \mathbb{R}^{N \times D} stacks them, D denotes the dimension, \mathbf{A}\in \mathbb{R}^{N \times N} is the adjacency matrix, \mathbf{H} \in \mathbb{R}^{N \times d} is the node representation aggregated by the GNN, and d is the aggregated dimension
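A minimal sketch of this pipeline, with hypothetical helpers llm_generate and lm_encode standing in for the (frozen) LLM and the smaller LM encoder; the one-step mean aggregation is only a placeholder GNN:

```python
import numpy as np

def llm_generate(text: str, prompt: str) -> str:
    """Hypothetical stand-in for a frozen LLM (e.g., an API call)."""
    return f"Explanation of: {text}"

def lm_encode(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical stand-in for a smaller LM encoder producing a D-dim embedding."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(dim)

def explanation_based_enhance(texts, prompt, adj):
    # e_i = LLM(t_i, p): the LLM generates extra explanation text per node
    explanations = [llm_generate(t, prompt) for t in texts]
    # x_i = LM(t_i, e_i): encode original text + explanation into node features
    X = np.stack([lm_encode(t + " " + e) for t, e in zip(texts, explanations)])
    # H = GNN(X, A): one mean-aggregation step as a placeholder GNN
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    return (adj @ X) / deg
```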

2.4.2. Embedding-based Enhancement

        ①This method directly generates node embeddings for the graph:

\mathbf{x}_i=\mathrm{LLM}\left ( t_i \right ),\quad \mathbf{H}=\mathrm{GNN}\left ( \mathbf{X},\mathbf{A} \right )

⭐this method requires embedding-visible or open-source LLMs because fine-tuning is needed
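Compared with 2.4.1, the sketch collapses to a single encoding step; llm_embed below is a hypothetical stand-in for an embedding-visible LLM:

```python
import numpy as np

def llm_embed(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical embedding-visible LLM: returns a D-dim text embedding."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(dim)

def embedding_based_enhance(texts, adj):
    X = np.stack([llm_embed(t) for t in texts])  # x_i = LLM(t_i)
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    return (adj @ X) / deg                       # H = GNN(X, A), mean-agg placeholder
```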

2.4.3. Discussions

        ①The performance may be superior, but the cost, emm... still needs to be optimized

2.5. LLM as Predictor

        ①The categorization is based on whether GNNs are employed to extract structural features for the LLM

        ②Graph information cannot be directly expressed as a sequence

        ③Two categories:

2.5.1. Flatten-based Prediction

        ①This method flattens the graph into a sequence of nodes or tokens G_{seq} and lets the LLM parse it:

G_{seq}=\mathrm{Flatten}\left ( \mathcal{V},\mathcal{E},\mathcal{T},\mathcal{J} \right ),\quad \tilde{Y}=\mathrm{LLM}\left ( G_{seq},p \right )

where \mathcal{V} is the node set, \mathcal{E} is the edge set, \mathcal{T} denotes the node text attribute set, \mathcal{J} is the edge text attribute set, p represents the instruction prompt, and \tilde{Y} is the predicted label

parse  v. to analyze (a sentence or its words) grammatically; <computing> to perform syntactic analysis on  n. <computing> a syntactic analysis or its result

        ②They further categorize this method into frozen and tuning approaches
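A minimal sketch of flattening a tiny text-attributed graph into a prompt; the serialization template is my own illustration (actual methods design these formats carefully):

```python
def flatten_graph(nodes, edges, node_texts):
    """G_seq = Flatten(V, E, T): serialize the graph as natural language."""
    lines = [f"Node {v}: {node_texts[v]}" for v in nodes]
    lines += [f"Node {u} is connected to node {v}." for u, v in edges]
    return "\n".join(lines)

nodes = [0, 1, 2]
edges = [(0, 1), (1, 2)]
node_texts = {0: "paper on GNNs", 1: "paper on LLMs", 2: "paper on surveys"}

prompt = flatten_graph(nodes, edges, node_texts) + \
    "\nQuestion: which category does node 0 belong to?"
# Y_tilde = LLM(G_seq, p): `prompt` would now be sent to a frozen or tuned LLM
print(prompt)
```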

2.5.2. GNN-based Prediction

        ①How the GNN and the LLM are combined:
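As a rough sketch of one common design in this family (my own illustration, not any specific surveyed model): a GNN first encodes the graph, and a learned projection maps its output into the LLM's embedding space so it can be consumed as soft "graph tokens":

```python
import torch
import torch.nn as nn

class GraphToLLMAdapter(nn.Module):
    """Projects GNN node embeddings into the LLM's token-embedding space."""
    def __init__(self, gnn_dim, llm_dim):
        super().__init__()
        self.proj = nn.Linear(gnn_dim, llm_dim)

    def forward(self, node_embeddings, text_embeddings):
        graph_tokens = self.proj(node_embeddings)  # (num_nodes, llm_dim)
        # prepend graph tokens to the embedded text prompt fed to the LLM
        return torch.cat([graph_tokens, text_embeddings], dim=0)

gnn_out = torch.randn(5, 128)       # embeddings from some pretrained GNN
prompt_emb = torch.randn(20, 4096)  # embedded text prompt of a hypothetical LLM
adapter = GraphToLLMAdapter(128, 4096)
llm_input = adapter(gnn_out, prompt_emb)  # (25, 4096) sequence for the LLM
```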

2.5.3. Discussions

        ①Advantage of LLMs as predictors: improved performance on zero-shot prediction

        ②Problem with flattening: long-range information cannot be captured because only a limited number of hops can be included

2.6. GNN-LLM Alignment

        ①Alignment achieves the fusion of GNN and LLM

        ②Visualization of this method:

2.6.1. Symmetric

        ①Approach: encode the graph and the text separately

        ②Limitation: lack of interaction between text and graph

        ③Solution: some researchers employ contrastive learning for alignment:

\ell(\mathbf{g}_{i},\mathbf{t}_{i})=-\log\frac{e^{s(\mathbf{g}_{i},\mathbf{t}_{i})/\tau}}{\sum_{k=1}^{|\mathcal{G}|}e^{s(\mathbf{g}_{i},\mathbf{t}_{k})/\tau}},\\\mathcal{L}_{\mathrm{InfoNCE}}=\frac{1}{2|\mathcal{G}|}\sum_{i=1}^{|\mathcal{G}|}\Big(\ell(\mathbf{g}_{i},\mathbf{t}_{i})+\ell(\mathbf{t}_{i},\mathbf{g}_{i})\Big),

where \mathbf{g} is the representation of a graph and \mathbf{t} its corresponding text, s\left ( \cdot ,\cdot \right ) denotes the score function that assigns high values to the positive pair and low values to negative pairs, \tau denotes the temperature parameter, and \left | \mathcal{G} \right | is the number of graphs for training
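A minimal implementation of the loss above, assuming cosine similarity as the score function s(\cdot,\cdot) (a common choice; the formula itself leaves s unspecified):

```python
import torch
import torch.nn.functional as F

def info_nce(g, t, tau=0.1):
    """Symmetric InfoNCE over a batch of graph/text embedding pairs.

    g, t: (B, d) tensors; row i of g and row i of t form the positive pair.
    """
    g = F.normalize(g, dim=-1)
    t = F.normalize(t, dim=-1)
    sim = g @ t.T / tau                       # s(g_i, t_k) / tau for all pairs
    labels = torch.arange(g.size(0))          # positive pair sits on the diagonal
    loss_g2t = F.cross_entropy(sim, labels)   # mean of l(g_i, t_i)
    loss_t2g = F.cross_entropy(sim.T, labels) # mean of l(t_i, g_i)
    return 0.5 * (loss_g2t + loss_t2g)

loss = info_nce(torch.randn(8, 64), torch.randn(8, 64))
```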

        ④Another way, shown as (b) in their figure: the two models interact and iterate alternately

2.6.2. Asymmetric

        ①Definition: one of the two models serves to assist the other

        ②Two approaches: graph-nested transformers and graph-aware distillation

2.6.3. Discussions

        ①Data scarcity might heavily affect alignment techniques

2.7. Future Directions

        ①Summarized models:

        ②Dealing with non-TAGs: some nodes lack text attributes and are hard to describe

        ③Dealing with data leakage: LLMs might have already seen some of the test data during pre-training...

        ④Improving transferability: the heterogeneity of graph attributes makes models harder to transfer

        ⑤Improving explainability: further exploration is needed

        ⑥Improving efficiency: reducing time and space costs

        ⑦Analyzing and improving expressive ability: graph structure is hard for LLMs to understand, namely, it is difficult for LLMs to distinguish between isomorphic and heterogeneous graphs

        ⑧LLMs as agents: LLMs can be regarded as agents in different fields

2.8. Conclusion

        ~

3. Supplementary Knowledge

3.1. Frozen and tuning in LLMs

(1)Frozen: the model's parameters are no longer updated or adjusted after training. In this state the model is treated as fixed and is used only for inference or text generation, with no further training. Freezing a model is typically done to ensure its original capability is not affected by subsequent operations.

(2)Tuning (fine-tuning): the already-trained model is trained further to adapt it to a specific task or dataset. Fine-tuning usually adjusts a subset of the model's parameters to improve performance in a particular application scenario; for example, a large general-purpose model can be fine-tuned to better understand the language or style of a given domain.
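A minimal PyTorch sketch of the two states; the tiny two-layer network is just a stand-in for any pretrained model:

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

# Frozen: parameters are excluded from gradient updates, inference only
for p in model.parameters():
    p.requires_grad = False

# Tuning: unfreeze (part of) the model for further task-specific training,
# e.g. only the last layer, as in lightweight fine-tuning
for p in model[-1].parameters():
    p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
```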

3.2. Superpixel graph

(1)Example: an image is segmented into superpixels (clusters of perceptually similar pixels), which become the nodes of a graph whose edges connect adjacent superpixels:

(2)Related article: Superpixel Image Classification with Graph Attention Networks | IEEE Conference Publication | IEEE Xplore

(3)My take as a layperson: what on earth is this? Classifying extremely large targets??? Is it really necessary

3.3. Agent

(1)Definition: in deep learning, an "agent" usually refers to an intelligent entity that makes decisions and takes actions within an environment. It learns the best policy by interacting with the environment in order to achieve a specific goal.

(2)Example: in reinforcement learning, a character in a game can be regarded as an agent. It observes the game state (e.g., position, enemies), takes actions (e.g., move, attack), and learns how to better accomplish the game's objectives from the rewards it receives (e.g., changes in score or health). Through continual trial and error, such an agent optimizes its policy and improves its performance.

4. Reference

Li, Y. et al. (2024) 'A Survey of Graph Meets Large Language Model: Progress and Future Directions', Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI-24), Survey Track, pp. 8123-8131.
