DistMult: EMBEDDING ENTITIES AND RELATIONS FOR LEARNING AND INFERENCE IN KNOWLEDGE BASES

This post walks through the bilinear model for link prediction and rule extraction: how the model is built (entity embeddings plus a diagonal-matrix restriction on the relation matrices), the choice of loss function (margin-based ranking loss), implementation improvements for link prediction, and how rule extraction helps improve knowledge graph completeness.


This paper presents the classic bilinear model, which performs knowledge graph completion through multiplicative interactions. See the original paper.

1 Introduction

The method adopts a bilinear model. Besides link prediction, the standard knowledge graph completion task, it can also mine logical rules from the learned relation embeddings, e.g. $BornInCity(a, b) \wedge CityOfCountry(b, c) \Rightarrow Nationality(a, c)$.

2 Model

2.1 Embedding

$X_{e_1}$ and $X_{e_2}$ are the one-hot encodings of the entities $e_1$ and $e_2$ in a triple, and $\mathbf{y}_{e_1} = f(\mathbf{W} X_{e_1})$, $\mathbf{y}_{e_2} = f(\mathbf{W} X_{e_2})$, where $\mathbf{W} \in \mathbb{R}^{dim \times n_e}$ is a parameter matrix, $n_e$ is the number of entities, and $dim$ is the embedding dimension. $\mathbf{W}$ can be randomly initialized, and $f$ is a nonlinear function, e.g. one similar to ReLU.
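Since multiplying $\mathbf{W}$ by a one-hot vector simply selects one column of $\mathbf{W}$, the projection is equivalent to an embedding-table lookup. A minimal sketch of this (not the authors' code; entity count, dimension, and the choice of $f$ are illustrative):

```python
import torch
import torch.nn as nn

n_e, dim = 10000, 100  # hypothetical entity count and embedding dimension

# W X_e with one-hot X_e just selects one row/column of W,
# so in practice the projection is an embedding lookup.
entity_emb = nn.Embedding(n_e, dim)         # the table plays the role of W
nn.init.xavier_uniform_(entity_emb.weight)  # random initialization, as in the text

def embed(entity_ids):
    # f: a nonlinearity applied on top of the lookup (tanh here)
    return torch.tanh(entity_emb(entity_ids))

y_e1 = embed(torch.tensor([3]))  # y_{e1} = f(W X_{e1}) for entity id 3
```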

2.2 Unified framework

The framework defines a basic linear scoring function $g_r^a$ and a bilinear scoring function $g_r^b$:
[Figure: definitions of the linear score $g_r^a$ and the bilinear score $g_r^b$; the bilinear form is given explicitly in Section 2.3 below.]

2.3 Model selection

This paper selects only the basic bilinear form as the scoring function, $g_r^b(\mathbf{y}_{e_1}, \mathbf{y}_{e_2}) = \mathbf{y}_{e_1}^T \mathbf{M}_r \mathbf{y}_{e_2}$, where $\mathbf{M}_r \in \mathbb{R}^{n \times n}$. Because a full $\mathbf{M}_r$ has too many parameters, it is restricted to a diagonal matrix, which brings the parameter count close to that of TransE.
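With $\mathbf{M}_r$ diagonal, the bilinear score collapses to a sum of three-way element-wise products. A minimal sketch (function and variable names are illustrative):

```python
import torch

def distmult_score(y_e1, r_diag, y_e2):
    # y_e1^T diag(r_diag) y_e2 = sum_k y_e1[k] * r_diag[k] * y_e2[k]
    return torch.sum(y_e1 * r_diag * y_e2, dim=-1)

# Hypothetical usage with random 100-dimensional vectors
y1, r, y2 = torch.randn(100), torch.randn(100), torch.randn(100)
print(distmult_score(y1, r, y2))
```

Note that the element-wise product is symmetric in $\mathbf{y}_{e_1}$ and $\mathbf{y}_{e_2}$, which is the well-known trade-off of the diagonal restriction.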

2.4 Loss function

The loss is again the margin-based ranking loss. Let $T$ be the set of positive triples, $T'$ the set of negatives, and $E$ the entity set. Negatives are obtained by corrupting each positive triple $(e_1, r, e_2)$: either $e_1$ or $e_2$ is replaced at random, so that $T' = \{(e_1', r, e_2) \mid e_1' \in E, (e_1', r, e_2) \notin T\} \cup \{(e_1, r, e_2') \mid e_2' \in E, (e_1, r, e_2') \notin T\}$. The loss function is:
$$\min \sum_{(e_1, r, e_2) \in T} \sum_{(e_1', r, e_2') \in T'} \max\{S_{(e_1', r, e_2')} - S_{(e_1, r, e_2)} + 1,\ 0\}$$
where $S_{(e_1, r, e_2)}$ is the scoring function applied to the triple.
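A minimal sketch of the negative sampling and the margin loss (the names are illustrative, and a full implementation would also filter corrupted triples that happen to fall in $T$):

```python
import torch

def corrupt(triples, n_entities):
    # Randomly replace the head or the tail of each (e1, r, e2) triple.
    e1, r, e2 = triples.unbind(dim=1)
    rand_ents = torch.randint(n_entities, e1.shape)
    corrupt_head = torch.rand(e1.shape) < 0.5
    e1c = torch.where(corrupt_head, rand_ents, e1)
    e2c = torch.where(corrupt_head, e2, rand_ents)
    return torch.stack([e1c, r, e2c], dim=1)

def margin_ranking_loss(pos_scores, neg_scores, margin=1.0):
    # sum of max{ S(corrupted) - S(positive) + margin, 0 }
    return torch.clamp(neg_scores - pos_scores + margin, min=0).sum()
```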

3 Summary

3.1 Inference task 1: link prediction

For each triple in the test data, each entity position is treated in turn as the target to predict: the model scores the correct entity together with all corrupted candidates from the entity dictionary and ranks them in descending order. Hits@n, MRR, and MR are used as the evaluation metrics.
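A minimal sketch of computing these ranking metrics from candidate scores (the test data here is randomly generated purely for illustration):

```python
import numpy as np

def rank_of_correct(scores, true_idx):
    # 1-based rank of the correct entity when all candidates
    # are sorted by score in descending order
    return int(np.sum(scores > scores[true_idx])) + 1

# Hypothetical test set: scores over 10000 candidate entities
# for each of 500 test triples, with random correct indices.
all_scores = np.random.randn(500, 10000)
true_ids = np.random.randint(10000, size=500)

ranks = np.array([rank_of_correct(s, t) for s, t in zip(all_scores, true_ids)])
print("MR      =", ranks.mean())           # Mean Rank
print("MRR     =", (1.0 / ranks).mean())   # Mean Reciprocal Rank
print("Hits@10 =", (ranks <= 10).mean())   # fraction ranked in the top 10
```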
The implementation also proposes some improvements:

  • Compared with models such as TransE, a nonlinear activation function, $tanh$, is introduced.
  • The embeddings are initialized with pre-trained vectors obtained via word2vec.

3.2 Inference task 2: rule extraction

Rule extraction targets rules such as $BornInCity(a, b) \wedge CityOfCountry(b, c) \Rightarrow Nationality(a, c)$. Such logical rules serve four important purposes:

  • First, they can help infer new facts and complete existing KBs.
  • Second, they can help optimize data storage: only the rules are stored rather than large amounts of extensional data, and facts are generated only at inference time.
  • Third, they can support complex reasoning.
  • Finally, they can provide explanations for inference results; for example, we might infer that a person's profession usually involves a specialization in the field they studied, and so on.

Traditional rule-mining methods do not handle the large data volumes of modern knowledge graphs well. The rules considered here take the form:
$$B_1(a_1, a_2) \wedge B_2(a_2, a_3) \wedge \cdots \wedge B_n(a_n, a_{n+1}) \Rightarrow H(a_1, a_{n+1})$$
where the $B_i$ and $H$ denote relations and the $a_i$ denote entities. The body relations $B_1, \dots, B_n$ are constrained to form a directed path in the graph, and $H$ corresponds to a directed edge that closes this path. A pattern of the form $B_{i-1}(a, b) \wedge B_i(a, c)$, whose atoms share a source entity, is rewritten with the inverse relation as $B_{i-1}^{-1}(b, a) \wedge B_i(a, c)$, so that the two atoms chain into a path.
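Because the relation matrices are diagonal, composing two relations along a path amounts to an element-wise product of their diagonal vectors, and candidate head relations $H$ can be ranked by how close they are to that product. A minimal sketch under these assumptions (Euclidean distance is one reasonable choice of closeness; names are illustrative):

```python
import numpy as np

def mine_length2_rules(rel_emb, rel_names, top_k=3):
    """rel_emb: (n_rels, dim) array holding the diagonal of each M_r.
    For every body pair (B1, B2), rank candidate heads H by the
    Euclidean distance between diag(B1) * diag(B2) and diag(H)."""
    rules = []
    n = len(rel_emb)
    for i in range(n):
        for j in range(n):
            composed = rel_emb[i] * rel_emb[j]  # product of diagonal matrices
            dists = np.linalg.norm(rel_emb - composed, axis=1)
            for h in np.argsort(dists)[:top_k]:
                rules.append((rel_names[i], rel_names[j],
                              rel_names[h], float(dists[h])))
    return rules

# Hypothetical usage with random relation embeddings
emb = np.random.randn(5, 100)
names = [f"r{k}" for k in range(5)]
print(mine_length2_rules(emb, names)[:3])
```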
