Coreference Resolution: Notes on "Improving Coreference Resolution by Learning Entity-Level Distributed Representations"

Paper: "Improving Coreference Resolution by Learning Entity-Level Distributed Representations"

Sections:

  2. System Architecture
  3. Building Representations
    3.1. Mention-Pair Encoder
    3.2. Cluster-Pair Encoder
  4. Mention-Ranking Model
  5. Cluster-Ranking Model
    5.1. Cluster-Ranking Policy Network
    5.2. Easy-First Cluster Ranking
    5.3. Deep Learning to Search
  6. Experiments and Results
    6.1. Mention-Ranking Model Experiments
    6.2. Cluster-Ranking Model Experiments

Abstract

A long-standing challenge in coreference resolution has been the incorporation of entity-level information – features defined over clusters of mentions instead of mention pairs. We present a neural network based coreference system that produces high-dimensional vector representations for pairs of coreference clusters. Using these representations, our system learns when combining clusters is desirable. We train the system with a learning-to-search algorithm that teaches it which local decisions (cluster merges) will lead to a high-scoring final coreference partition. The system substantially outperforms the current state-of-the-art on the English and Chinese portions of the CoNLL 2012 Shared Task dataset despite using few hand-engineered features.

In short: the long-standing challenge addressed here is incorporating entity-level information, i.e., features defined over clusters of mentions rather than over mention pairs. The authors build a neural coreference system that produces high-dimensional vector representations for pairs of coreference clusters and uses them to learn when merging two clusters is desirable. The system is trained with a learning-to-search algorithm that teaches it which local decisions (cluster merges) lead to a high-scoring final coreference partition, and it substantially outperforms the previous state of the art on the English and Chinese portions of the CoNLL 2012 Shared Task dataset despite using few hand-engineered features.

Coreference resolution, the task of identifying which mentions in a text refer to the same real-world entity, is fundamentally a clustering problem. However, many recent state-of-the-art coreference systems operate solely by linking pairs of mentions together (Durrett and Klein, 2013; Martschat and Strube, 2015; Wiseman et al., 2015).

That is, coreference resolution, identifying which mentions in a text refer to the same real-world entity, is fundamentally a clustering problem, yet many recent state-of-the-art systems operate only by linking pairs of mentions together.

An alternative approach is to use agglomerative clustering, treating each mention as a singleton cluster at the outset and then repeatedly merging clusters of mentions deemed to be referring to the same entity. Such systems can take advantage of entity-level information, i.e., features between clusters of mentions instead of between just two mentions. As an example of why this is useful, it is clear that the clusters {Bill Clinton} and {Clinton, she} are not referring to the same entity, but it is ambiguous whether the pair of mentions Bill Clinton and Clinton are coreferent.

In other words, agglomerative clustering starts with every mention in its own singleton cluster and repeatedly merges clusters believed to refer to the same entity. Such a system can exploit entity-level information: features between clusters of mentions rather than between just two mentions. For example, the clusters {Bill Clinton} and {Clinton, she} clearly do not refer to the same entity, while it is ambiguous whether the mention pair Bill Clinton and Clinton is coreferent.
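The merge loop described above can be sketched in a few lines. This is a toy illustration, not the paper's system: `score` is a hypothetical stand-in for the learned cluster-pair scorer (here just token overlap between clusters), and the merge threshold is an arbitrary assumption.

```python
def score(c1, c2):
    # Toy stand-in for a learned cluster-pair scorer (assumption):
    # Jaccard overlap between the token sets of the two clusters.
    t1 = {w for m in c1 for w in m.split()}
    t2 = {w for m in c2 for w in m.split()}
    return len(t1 & t2) / max(len(t1 | t2), 1)

def agglomerative_coref(mentions, threshold=0.2):
    # Every mention starts as its own singleton cluster.
    clusters = [[m] for m in mentions]
    while True:
        # Easy-first: find the single most confident cluster pair to merge
        # before committing to harder, more ambiguous decisions.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = score(clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        if best is None or best[0] < threshold:
            break  # no pair is confident enough to merge
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

print(agglomerative_coref(["Bill Clinton", "Clinton", "she", "the president"]))
# → [['Bill Clinton', 'Clinton'], ['she'], ['the president']]
```

Note that a pairwise model linking "Clinton" and "she" in isolation could wrongly pull "she" into the cluster; scoring whole clusters lets the {Bill Clinton, Clinton} evidence veto that merge.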

Previous work has incorporated entity-level information through features that capture hard constraints like having gender or number agreement between clusters (Raghunathan et al., 2010; Durrett et al., 2013). In this work, we instead train a deep neural network to build distributed representations of pairs of coreference clusters. This captures entity-level information with a large number
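The excerpt above (cut off in the original post) states the paper's core idea: learned distributed representations of cluster pairs instead of hand-written hard constraints. A minimal sketch of the general scheme, under the assumption that a mention-pair encoder already exists (here it is replaced by a toy hashed-feature projection, not the paper's network):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))  # toy projection; the real encoder is a deep net

def mention_pair_vec(m1, m2):
    # Hypothetical featurization: hashed bag-of-words over the mention pair,
    # projected to a small dense vector. Stands in for the mention-pair encoder.
    x = np.zeros(8)
    for w in (m1 + " " + m2).split():
        x[hash(w) % 8] += 1.0
    return np.tanh(W.T @ x)

def cluster_pair_rep(c1, c2):
    # Build one vector per cross-cluster mention pair, then pool element-wise
    # (max and mean) and concatenate, giving a fixed-size cluster-pair vector
    # regardless of how many mentions each cluster contains.
    R = np.stack([mention_pair_vec(m1, m2) for m1 in c1 for m2 in c2])
    return np.concatenate([R.max(axis=0), R.mean(axis=0)])

rep = cluster_pair_rep(["Bill Clinton", "Clinton"], ["Clinton", "she"])
print(rep.shape)  # (8,) — twice the mention-pair dimension
```

The pooling is what makes the representation entity-level: a downstream scorer sees evidence aggregated over every mention pair between the two clusters, not just one pair at a time.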
