Incorporating Word Correlation Knowledge into Topic Modeling

This post looks at MRF-LDA, a model that combines a Markov Random Field (MRF) with Latent Dirichlet Allocation (LDA) to better capture semantic coherence between words in topic models. The model assumes the topic and word distributions follow Dirichlet distributions, strengthens topical coherence through the MRF, and uses unary and pairwise potentials to refine topic assignments.


When I first skimmed the paper I did not understand how the MRF was being used; only after working through the details did the key points become clear.

Paper

A brief description of the MRF-LDA setting: assume the topic distribution θ and the word distributions β both follow Dirichlet distributions, and every word carries a topic label z. Word correlations, mainly semantic similarity, are extracted from external knowledge and used to encourage topical coherence; this is where the MRF comes in, at the latent topic layer. Given a document d with N words, for each correlated word pair (according to the external knowledge) an undirected edge is created between their topic labels, yielding a graph G over the topic labels. The figure below shows an example with 5 nodes and 4 edges: (z1, z3), (z2, z5), (z3, z4), (z3, z5).
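As an illustration of this graph-construction step, here is a minimal sketch. It assumes the external knowledge takes the form of pretrained word vectors plus a cosine-similarity threshold; both are illustrative assumptions rather than the paper's exact procedure:

```python
import numpy as np

def build_correlation_graph(doc_words, word_vecs, threshold=0.8):
    """Connect the topic labels of word pairs whose semantic similarity
    exceeds a threshold (illustrative criterion; word_vecs is an assumed
    dict mapping each word to a pretrained embedding)."""
    edges = []
    for i in range(len(doc_words)):
        for j in range(i + 1, len(doc_words)):
            vi, vj = word_vecs[doc_words[i]], word_vecs[doc_words[j]]
            sim = vi @ vj / (np.linalg.norm(vi) * np.linalg.norm(vj))
            if sim > threshold:
                edges.append((i, j))  # undirected edge between z_i and z_j
    return edges
```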

With the graph in hand, an MRF is defined over it via unary potentials on the nodes and pairwise potentials on the edges. The unary potential is p(z_i | θ), the topic multinomial; the pairwise potential encourages correlated words to take similar (ideally identical) topic assignments.

[Figure: word-correlation graph over the topic labels z1–z5, with edges (z1, z3), (z2, z5), (z3, z4), (z3, z5)]
In the MRF, the joint probability of all topic assignments is:
$$p(\mathbf{z} \mid \boldsymbol{\theta}, \lambda) = \frac{1}{A(\boldsymbol{\theta}, \lambda)} \prod_{i=1}^{N} p(z_i \mid \boldsymbol{\theta}) \, \exp\Big(\lambda \sum_{(m,n) \in E} \mathbb{I}(z_m = z_n)\Big)$$

where $E$ is the edge set, $\mathbb{I}(\cdot)$ is the indicator function, $\lambda \ge 0$ weights the correlation knowledge against the topic multinomial, and $A(\boldsymbol{\theta}, \lambda)$ is the partition function that normalizes the distribution.
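To make the formula concrete, here is a minimal sketch of the unnormalized log of this joint; the partition function $A(\boldsymbol{\theta}, \lambda)$ is left out since it is precisely what makes exact inference intractable:

```python
import numpy as np

def log_joint_unnormalized(z, theta, edges, lam):
    """log p(z | theta, lambda) up to the log partition function:
    the sum of log unary potentials plus lambda times the number of
    edges whose endpoints agree on their topic."""
    unary = np.sum(np.log(theta[z]))                  # sum_i log p(z_i | theta)
    agreements = sum(z[i] == z[j] for i, j in edges)  # pairwise indicator terms
    return unary + lam * agreements

# Toy example: 5 words, 3 topics, and the 4 edges from the figure above
# (0-based indices, so (z1, z3) becomes (0, 2), and so on).
theta = np.array([0.5, 0.3, 0.2])
z = np.array([0, 1, 0, 0, 1])
edges = [(0, 2), (1, 4), (2, 3), (2, 4)]
print(log_joint_unnormalized(z, theta, edges, lam=1.0))
```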

Parameter learning is done with variational inference; the details are omitted here.

### Knowledge Graph Link Prediction Frameworks for Retrieval and Reading

Knowledge graph link prediction is the task of predicting missing links or relationships between entities in a knowledge graph. Several frameworks have been developed for this purpose to support both retrieval and reading.

#### 1. TransE Model

TransE is one of the foundational knowledge graph embedding methods. It represents each entity and relation as a vector in a low-dimensional space, and scores how well a triplet $(h,r,t)$ holds with $f_r(h,t)=||\mathbf{e}_h+\mathbf{r}-\mathbf{e}_t||_2$[^1]. The model assumes that a relation acts as a translation from the head entity to the tail entity via vector addition.

#### 2. Convolutional Knowledge Graph Embeddings (ConvE)

ConvE extends embedding models like TransE by reshaping the subject and relation embeddings into two-dimensional matrices, applying convolutional layers over them, and feeding the result through a fully connected layer. The transformed representation is then scored against the object embeddings via element-wise multiplication and summation across all dimensions, i.e. a dot product[^2].

```python
import torch
import torch.nn.functional as F
from torch import nn

class ConvE(nn.Module):
    def __init__(self, num_entities, num_relations, embedding_dim=200):
        super(ConvE, self).__init__()
        # embedding_dim must equal 10 * 20 to match the reshape below.
        self.entity_embeddings = nn.Embedding(num_entities, embedding_dim)
        self.relation_embeddings = nn.Embedding(num_relations, embedding_dim)
        self.conv = nn.Conv2d(1, 32, kernel_size=3)       # 2D convolution over stacked embeddings
        self.fc = nn.Linear(32 * 18 * 18, embedding_dim)  # project conv features back to embedding space

    def forward(self, sub, rel):
        # Reshape the embeddings into 10x20 "images" and stack them vertically.
        e_s = self.entity_embeddings(sub).view(-1, 1, 10, 20)
        r = self.relation_embeddings(rel).view(-1, 1, 10, 20)
        stacked_inputs = torch.cat([e_s, r], dim=2)       # (batch, 1, 20, 20)
        x = F.relu(self.conv(stacked_inputs))             # (batch, 32, 18, 18)
        x = self.fc(x.view(x.size(0), -1))                # (batch, embedding_dim)
        # Score every candidate object by a dot product with its embedding.
        return x @ self.entity_embeddings.weight.t()      # (batch, num_entities)
```

#### 3. ComplEx Model

ComplEx addresses limitations of earlier approaches, such as their inability to model symmetric and antisymmetric relations effectively. By using complex-valued rather than real-valued embeddings, it can capture asymmetric interactions more accurately while keeping computational costs comparable to simpler models[^3].

For applications that require advanced reasoning beyond simple triplets, architectures combining neural networks with symbolic logic offer promising directions that are not covered here but are worth exploring depending on specific requirements.
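As a concrete complement to the TransE and ComplEx descriptions above, here is a minimal sketch of the two score functions; the tensor shapes and toy inputs are illustrative assumptions, not part of the original models' training setups:

```python
import torch

def transe_score(e_h, r, e_t):
    """TransE: a smaller L2 distance ||e_h + r - e_t|| means a more plausible triplet."""
    return torch.norm(e_h + r - e_t, p=2, dim=-1)

def complex_score(e_h, r, e_t):
    """ComplEx: Re(<e_h, r, conj(e_t)>) with complex-valued embeddings;
    a higher score means a more plausible triplet."""
    return torch.real(torch.sum(e_h * r * torch.conj(e_t), dim=-1))

# Toy usage with random embeddings (batch of 4 triplets, 50 dimensions).
h_real, r_real, t_real = (torch.randn(4, 50) for _ in range(3))
print(transe_score(h_real, r_real, t_real).shape)   # torch.Size([4])

h_c, r_c, t_c = (torch.randn(4, 50, dtype=torch.cfloat) for _ in range(3))
print(complex_score(h_c, r_c, t_c).shape)           # torch.Size([4])
```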