[Paper Notes] QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

changreal

Published 2019-12-02 15:16:41


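Categories: Paper Notes, NLP. Tags: NLP, QANet, MRC, Paper Notes

Copyright notice: this is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.youkuaiyun.com/changreal/article/details/103349145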

Contents

1. Brief introduction

3. data augmentation by backtranslation

1. Brief introduction

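Key contributions of the model:

(1) RNNs are removed; the core of the model is convolution plus self-attention. This makes training faster, so the model can be trained on more data in the same amount of time. Convolution captures the local structure of the context (local interactions), while self-attention models global interactions. The two are complementary, and neither can replace the other.

(2) An auxiliary data augmentation technique is used to enlarge the training set. The additional data comes from back-translation with MT models.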

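According to the paper, QANet is the first reading comprehension model to be both fast and accurate, and the first to combine self-attention with convolution.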

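The QANet architecture uses convolutions and self-attention throughout as the building blocks of its encoders. It first encodes the query and the context separately, then uses standard attention to learn the interactions between the context and the question. The resulting representation is encoded again and finally decoded into the probability of each position being the start or the end of the answer span.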
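The final decoding step can be illustrated with a small sketch. The QANet paper picks the span (s, e) with s <= e that maximizes the product of the start and end probabilities, which a single left-to-right pass can find. The function name `best_span` and the toy probabilities below are made up for illustration and are not taken from the paper's code.

```python
def best_span(p_start, p_end):
    """Return (start, end, score) maximizing p_start[s] * p_end[e] with s <= e."""
    best = (0, 0, p_start[0] * p_end[0])
    best_s = 0  # index of the largest start probability seen so far
    for e in range(len(p_end)):
        if p_start[e] > p_start[best_s]:
            best_s = e
        score = p_start[best_s] * p_end[e]
        if score > best[2]:
            best = (best_s, e, score)
    return best


# Toy distributions over four context positions (illustrative values only).
p_start = [0.1, 0.6, 0.2, 0.1]
p_end = [0.5, 0.1, 0.3, 0.1]
start, end, score = best_span(p_start, p_end)
print(start, end)
```

Taking the two argmaxes independently would give start 1 and end 0 here, which is not a valid span; the constrained search returns the span from position 1 to position 2 instead.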

Component analysis:

convolution: local structure
self-attention: global interaction
additional context-query attention: a standard module used to build the query-aware context vector
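To make the "query-aware context vector" concrete, here is a minimal sketch of context-to-query attention. It is a simplification: the paper scores each context/query pair with a learned trilinear function, whereas this sketch assumes a plain dot product, and the names `context_to_query`, `softmax`, and `dot` are hypothetical helpers.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [v / total for v in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def context_to_query(context, query):
    """For each context vector, return an attention-weighted sum of query vectors."""
    dim = len(query[0])
    attended = []
    for c in context:
        # Similarity of this context word to every query word (dot product is an assumption).
        weights = softmax([dot(c, q) for q in query])
        attended.append(
            [sum(w * q[d] for w, q in zip(weights, query)) for d in range(dim)]
        )
    return attended


# Three context words and two query words in a 2-dimensional toy space.
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [[4.0, 0.0], [0.0, 4.0]]
attended = context_to_query(context, query)
```

Each row of `attended` has the same length as a query vector, and there is one row per context word. A context word that resembles one query word is pulled toward that query word, while the third context word, equally similar to both, receives their average.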

QANet architecture

It consists of five main components: an input embedding layer, an embedding encoder layer, a context-query attention layer, a model encoder layer, and an output layer.

Unlike other MRC models, all of the embedding and model encoders use only convolution and self-attention.
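The difference between local convolution and global self-attention can be shown with a toy encoder block. This is only a sketch under strong assumptions: a fixed per-channel kernel stands in for the learned depthwise separable convolution, self-attention is single-head with identity projections, and layer normalization, positional encoding, and the feed-forward layer of the real block are left out. All function names are hypothetical.

```python
import math


def conv1d_same(seq, kernel):
    """Per-channel 1D convolution with zero padding; output length equals input length."""
    half = len(kernel) // 2
    dim = len(seq[0])
    out = []
    for i in range(len(seq)):
        row = [0.0] * dim
        for k, w in enumerate(kernel):
            j = i + k - half  # only neighbours within the kernel window contribute
            if 0 <= j < len(seq):
                for d in range(dim):
                    row[d] += w * seq[j][d]
        out.append(row)
    return out


def self_attention(seq):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in seq]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        # Every position contributes to every output row.
        out.append([sum(e / total * v[d] for e, v in zip(exps, seq)) for d in range(dim)])
    return out


def residual(x, fx):
    """Element-wise residual connection x + f(x)."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, fx)]


def encoder_block(seq, kernel=(0.25, 0.5, 0.25)):
    """Convolution followed by self-attention, each wrapped in a residual connection."""
    h = residual(seq, conv1d_same(seq, kernel))
    return residual(h, self_attention(h))


seq_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 0.0]]
seq_b = seq_a[:-1] + [[0.0, 2.0]]  # same sequence, only the last token differs
kernel = (0.25, 0.5, 0.25)
conv_unchanged = conv1d_same(seq_a, kernel)[0] == conv1d_same(seq_b, kernel)[0]
attn_changed = self_attention(seq_a)[0] != self_attention(seq_b)[0]
encoded = encoder_block(seq_a)
```

Changing only the last token leaves the convolution output at position 0 untouched, because a width-3 kernel never sees that far, while the self-attention output at position 0 does change. That is the local versus global split the notes describe.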

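A novel auxiliary data augmentation technique: the original English text is translated into French and then back into English, which not only increases the number of training instances but, more importantly, diversifies the phrasing. When translating from English to French, a beam decoder produces k French translations; each French translation is then passed through another beam decoder back into English, again producing k candidates, which yields k^2 paraphrases in total.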
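The k^2 count can be seen in a short sketch. The two translation models are replaced here by hypothetical lookup tables (`EN_FR`, `FR_EN`) with made-up sentences, and `beam_translate` is a stand-in for a real NMT beam decoder, so this only illustrates how the candidates multiply, not how the translations are produced.

```python
def beam_translate(sentence, table, k):
    """Stand-in for an NMT beam decoder: return up to k candidate translations."""
    return table.get(sentence, [])[:k]


def back_translate(sentence, en_fr, fr_en, k):
    """English -> k French candidates -> k English candidates each (k * k in total)."""
    paraphrases = []
    for fr in beam_translate(sentence, en_fr, k):
        for en in beam_translate(fr, fr_en, k):
            paraphrases.append(en)
    return paraphrases


# Toy "models": hand-written candidate lists, purely illustrative.
EN_FR = {
    "how old is he?": ["quel âge a-t-il ?", "il a quel âge ?"],
}
FR_EN = {
    "quel âge a-t-il ?": ["how old is he?", "what is his age?"],
    "il a quel âge ?": ["he is how old?", "what age is he?"],
}

k = 2
paraphrases = back_translate("how old is he?", EN_FR, FR_EN, k)
print(len(paraphrases))
```

With k = 2 the sketch returns four candidates. Some of them may repeat the original sentence, so in practice a deduplication or selection step would follow.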
