Classifying Temporal Relations by Bidirectional LSTM over Dependency Paths

This post summarizes a method for temporal relation extraction that works across sentences, proposed by Fei Cheng and Yusuke Miyao of the National Institute of Informatics, Japan. By introducing a "common root", the method solves the problem of extracting relations between entities in different sentences under a dependency-path representation. I did not study the experiments in detail, but the paper achieves good results without relying on external knowledge or manually annotated entity attributes.

Although I set out to write this post, there are still parts I have not studied, such as the TimeBank-Dense corpus. Since I am unlikely to use that corpus in the future, I simply skipped it.

This is a short paper from ACL 2017, written by Fei Cheng and Yusuke Miyao of the National Institute of Informatics, Japan.

In the introduction, the authors describe the task. Temporal relation extraction determines whether some relation holds between a pair of temporal entities. There are two kinds of entities: events and temporal expressions. Pairs come in three types: event-event (E-E), event-time (E-T), and event-DCT (document creation time, E-D). The paper achieves good performance without using external knowledge or manually annotated entity attributes.

In the method section, the authors point out that cross-sentence entity pairs make up a large share of the corpus. Because the model takes dependency paths as input, and a dependency path is normally defined within a single sentence, representing the dependency path between entities in different sentences is a major obstacle. The authors therefore assume that two adjacent sentences share a "common root", which makes such a path representable, as shown in the figure below:
[Figure: two adjacent dependency trees joined under a shared common root (1178414-20171130202717492-1384735598.png)]

The figure makes this clear: each of the two sentences is first represented as a dependency tree, and the two trees then share a "common root", which solves the cross-sentence problem. Dependency parses are obtained with the Stanford CoreNLP toolkit. The input vector for each word is the concatenation of its word embedding, part-of-speech embedding, and dependency-relation embedding.
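As a rough sketch of this input representation (the vocabulary sizes and embedding dimensions below are my own assumptions, not values from the paper), the per-token input could be built like this:

```python
import torch
import torch.nn as nn

# Hypothetical vocabulary sizes and embedding dimensions (not from the paper).
WORD_VOCAB, POS_VOCAB, DEP_VOCAB = 10000, 50, 40
WORD_DIM, POS_DIM, DEP_DIM = 100, 20, 20

word_emb = nn.Embedding(WORD_VOCAB, WORD_DIM)
pos_emb = nn.Embedding(POS_VOCAB, POS_DIM)
dep_emb = nn.Embedding(DEP_VOCAB, DEP_DIM)

def token_input(word_id, pos_id, dep_id):
    """Concatenate word, POS, and dependency-relation embeddings for one token."""
    lookup = lambda emb, idx: emb(torch.tensor([idx]))
    return torch.cat(
        [lookup(word_emb, word_id),
         lookup(pos_emb, pos_id),
         lookup(dep_emb, dep_id)],
        dim=-1,
    )

vec = token_input(3, 7, 5)
print(vec.shape)  # torch.Size([1, 140])
```

Each token along a dependency path would be encoded this way before being fed to the BiLSTM.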

E-E and E-T pairs share a single classifier: the path from the source entity to the common root and the path from the target entity to the common root each pass through their own bidirectional LSTM. E-D pairs involve only the event's dependency-path branch, so a single BiLSTM branch is used. The figure below shows the architecture:
[Figure: two-branch BiLSTM architecture over dependency paths (1178414-20171130202729414-553155215.png)]
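The two-branch idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the hidden size, the number of relation labels, and the use of the final hidden states as branch features are all my assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchPathClassifier(nn.Module):
    """Two BiLSTMs, one per dependency path (entity -> common root);
    their final states are concatenated and fed to a linear classifier."""
    def __init__(self, input_dim=140, hidden=64, num_labels=6):
        super().__init__()
        self.src_lstm = nn.LSTM(input_dim, hidden,
                                batch_first=True, bidirectional=True)
        self.tgt_lstm = nn.LSTM(input_dim, hidden,
                                batch_first=True, bidirectional=True)
        self.out = nn.Linear(4 * hidden, num_labels)

    def forward(self, src_path, tgt_path):
        # src_path / tgt_path: (batch, path_len, input_dim) token vectors
        # along each entity-to-common-root path.
        _, (h_src, _) = self.src_lstm(src_path)
        _, (h_tgt, _) = self.tgt_lstm(tgt_path)
        # Concatenate forward/backward final hidden states of both branches.
        feat = torch.cat([h_src[0], h_src[1], h_tgt[0], h_tgt[1]], dim=-1)
        return self.out(feat)

model = TwoBranchPathClassifier()
logits = model(torch.randn(2, 5, 140), torch.randn(2, 4, 140))
print(logits.shape)  # torch.Size([2, 6])
```

For E-D pairs, only one branch would be used, since the DCT has no dependency path of its own.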

I did not study the experiment section in detail; this is the gap I mentioned at the beginning. If I get a chance to work on this task, I will need to read it carefully.

That concludes this introduction to the paper. If I have misunderstood anything, corrections are welcome.

Reproduced from: https://www.cnblogs.com/WanJiaJia/p/7931503.html

