Karpathy, Andrej, Justin Johnson, and Li Fei-Fei. “Visualizing and Understanding Recurrent Networks.” arXiv preprint arXiv:1506.02078 (2015). (Citations: 79).
1 RNN
The RNN has the form

\[
\vec{h}_t = \tanh\!\left( W \begin{pmatrix} \vec{x}_t \\ \vec{h}_{t-1} \end{pmatrix} \right)
\]

where $W$ varies between layers but is shared through time, and $\vec{x}_t$ is the input from the layer below.
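As a concrete illustration, here is a minimal sketch of one vanilla RNN step under this formulation. This is not code from the paper; the function name rnn_step, the NumPy dependency, and the omission of bias terms are assumptions made here for brevity.

```python
import numpy as np

def rnn_step(W, x, h_prev):
    """One vanilla RNN step: h_t = tanh(W [x_t; h_{t-1}]).

    W is shared across time steps, with shape (hidden, input + hidden);
    bias terms are omitted for brevity."""
    return np.tanh(W @ np.concatenate([x, h_prev]))

# Example: unroll over a short sequence with the same W at every step.
rng = np.random.default_rng(0)
hidden, inp = 4, 3
W = rng.normal(scale=0.1, size=(hidden, inp + hidden))
h = np.zeros(hidden)
for x in rng.normal(size=(5, inp)):   # 5 time steps
    h = rnn_step(W, x, h)
```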
It was observed that the back-propagation dynamics caused the gradients in an RNN to either vanish or explode.
2 LSTM
The exploding gradient concern can be alleviated with the heuristic of clipping the gradients, and LSTMs were designed to mitigate the vanishing gradient problem. In addition to a hidden state vector $\vec{h}_t$, LSTMs also maintain a memory vector $\vec{c}_t$. At each time step the LSTM can choose to read from, write to, or reset the memory cell using explicit gating mechanisms.
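A minimal sketch of this gated update, assuming a standard single-layer LSTM with one concatenated weight matrix and no bias terms (the names lstm_step and sigmoid are introduced here, not taken from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(W, x, h_prev, c_prev):
    """One LSTM step:
        i, f, o = sigm(...), g = tanh(...), all computed from W [x_t; h_{t-1}]
        c_t = f * c_{t-1} + i * g      (gated write into the memory cell)
        h_t = o * tanh(c_t)            (gated read out of the memory cell)
    W has shape (4 * hidden, input + hidden); biases omitted for brevity."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev])
    i = sigmoid(z[:n])            # input gate: how much of g to write
    f = sigmoid(z[n:2 * n])       # forget gate: how much of c_{t-1} to keep
    o = sigmoid(z[2 * n:3 * n])   # output gate: how much of the cell to expose
    g = np.tanh(z[3 * n:])        # candidate update
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c
```

Because the memory cell is updated additively (c_t = f ⊙ c_{t−1} + i ⊙ g), gradients can flow backward through many time steps without repeatedly passing through a squashing nonlinearity, which is what mitigates the vanishing gradient problem.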
In summary, the article examines problems with recurrent neural networks (RNNs), such as vanishing and exploding gradients, and focuses on two architectures designed to address them: the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). The LSTM achieves stable learning through gating mechanisms, allowing information to be propagated back over long time spans without degradation. The GRU performs a smooth update via a candidate hidden vector and gating mechanisms. These networks are widely used in deep learning and computer vision.
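For comparison, a minimal sketch of the standard GRU update described above (again not the paper's code; the function name gru_step and the per-gate weight matrices are assumptions made here):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(Wr, Wz, Wh, x, h_prev):
    """One GRU step:
        r  = sigm(Wr [x_t; h_{t-1}])           reset gate
        z  = sigm(Wz [x_t; h_{t-1}])           update gate
        h~ = tanh(Wh [x_t; r * h_{t-1}])       candidate hidden vector
        h_t = (1 - z) * h_{t-1} + z * h~       smooth interpolation
    Each W* has shape (hidden, input + hidden); biases omitted."""
    r = sigmoid(Wr @ np.concatenate([x, h_prev]))
    z = sigmoid(Wz @ np.concatenate([x, h_prev]))
    h_cand = np.tanh(Wh @ np.concatenate([x, r * h_prev]))
    return (1.0 - z) * h_prev + z * h_cand
```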