[Paper Notes] Attention Summary II: The Essential Idea of Attention + Hard/Soft/Global/Local Attention

Link to this post: https://blog.youkuaiyun.com/changreal/article/details/102518702

Attention Summary II:

Papers covered:

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (uses hard/soft attention)

Effective Approaches to Attention-based Neural Machine Translation (proposes global/local attention)

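Articles referenced in this post:

Attention, Part 2
Five Attention Model Methods You Need to Know and Their Applications
A Survey of Attention Model Methods
Attention Mechanism Paper Reading: Global Attention and Local Attention
Global Attention / Local Attention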

Outline of this post

The essential idea of the attention mechanism
A summary of the attention variants (hard/soft/global/local attention)
Other attention-related topics

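1 The essential idea of the attention mechanism

For the essential idea, see this article, which also covers self-attention.
Put simply, attention operates on a (query, key, value) triple: the query is compared against each key, and the resulting weights are applied to the values. In machine translation the key and the value are the same.
PS: For the basic idea of the attention mechanism as applied in NMT, see the paper summary: Attention Summary I.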
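
The (query, key, value) view can be sketched in a few lines of plain Python. This is only an illustration, not the formulation of any particular paper: it assumes dot-product scoring (other scoring functions exist), and the names `attend`, `encoder_states`, and `decoder_query` as well as the toy numbers are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Weight each value by how well its key matches the query.

    Assumption: dot-product scoring between the query and each key.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The context vector is the weighted sum of all values.
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Hypothetical encoder states; in machine translation they act as both keys and values.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_query = [2.0, 0.0]

weights, context = attend(decoder_query, encoder_states, encoder_states)
```

Passing `encoder_states` as both `keys` and `values` mirrors the remark that key and value coincide in machine translation.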

2 Types of attention

Here are the other kinds of attention:

hard attention
soft attention
global attention
local attention
self-attention: target = source -> Multi-head attention (covered in Attention Summary III)

2.1 Hard attention

Paper: Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.
Hard attention structure
Source of these notes: A Survey of Attention Model Methods

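Soft attention keeps all components and combines them with weights, whereas hard attention selects only a subset of the components according to some strategy. In other words, hard attention attends to just part of the input.
Soft attention is differentiable, so it can be trained with standard backpropagation.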

Characteristics of hard attention:
the hard attention model is non-differentiable and requires more complicated techniques such as variance reduction or reinforcement learning to train
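
The contrast between the two can be sketched as follows. This is a rough illustration only: it assumes the hard variant samples a single position from the attention weights, which is just one possible selection strategy, and the function names and toy numbers are hypothetical.

```python
import random

def soft_attention(weights, values):
    """Keep every component: return the weighted average of all values."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def hard_attention(weights, values, rng):
    """Keep one component: sample a single index with probability given by weights.

    Assumption: sampling is the selection strategy. The discrete choice made
    here is the step that gradients cannot flow through.
    """
    indices = list(range(len(values)))
    index = rng.choices(indices, weights=weights, k=1)[0]
    return index, values[index]

# Toy attention weights and values, for illustration only.
weights = [0.7, 0.2, 0.1]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

soft_context = soft_attention(weights, values)  # blends all three values

rng = random.Random(0)  # seeded only to make the sketch repeatable
picked_index, hard_context = hard_attention(weights, values, rng)  # exactly one value
```

The soft result changes smoothly as the weights change, so backpropagation applies directly. The hard result jumps from one value to another, which is why sampling-based training techniques such as reinforcement learning with variance reduction are needed.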