Week 13: Deep Learning Supplement: Training RNNs


License: CC 4.0 BY-SA

Tags:

#deep-learning #rnn #artificial-intelligence


Abstract

This week we continued to follow Professor Hung-yi Lee's course and studied the principles behind RNNs, digging further into the underlying mathematics. The focus was the BPTT (backpropagation through time) algorithm and ways to address the vanishing gradient problem.

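1. Training an RNN

Take slot filling as an example. For the current word $x^i$, the RNN outputs a vector $\hat{y}^i$ whose entries give the probability that the word belongs to each slot. The loss for that word is the cross-entropy between the reference label $y^i$ and $\hat{y}^i$, and the loss for the whole network is the sum of these per-word losses. Note that the word order must not be shuffled: in an RNN each output depends on the words that came before it, so $x^{i+1}$ has to be fed in immediately after $x^i$.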
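The summed loss described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under assumptions that are not in the course material: a plain tanh RNN cell, randomly initialised weights named `Wx`, `Wh`, and `Wo`, integer slot labels standing in for the one-hot $y^i$, and a hypothetical helper called `sequence_loss`.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sequence_loss(xs, ys, Wx, Wh, Wo):
    """Sum of per-word cross-entropy losses for one sentence.

    xs: (T, input_dim) word vectors, in sentence order
    ys: (T,) integer slot labels (the index of the 1 in each one-hot y^i)
    Wx, Wh, Wo: illustrative weights of an assumed tanh RNN cell
    """
    h = np.zeros(Wh.shape[0])            # initial hidden state
    total = 0.0
    for x, y in zip(xs, ys):
        h = np.tanh(Wx @ x + Wh @ h)     # hidden state carries the earlier words
        y_hat = softmax(Wo @ h)          # predicted slot distribution for this word
        total += -np.log(y_hat[y])       # cross-entropy against a one-hot target
    return total

# Toy data: every value here is made up for illustration.
rng = np.random.default_rng(0)
T, input_dim, hidden_dim, n_slots = 5, 4, 8, 3
xs = rng.normal(size=(T, input_dim))
ys = rng.integers(0, n_slots, size=T)
Wx = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
Wh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
Wo = rng.normal(scale=0.5, size=(n_slots, hidden_dim))

loss_in_order = sequence_loss(xs, ys, Wx, Wh, Wo)
# Same (word, label) pairs fed in reverse order: the hidden states differ,
# so the total loss generally differs as well.
loss_reversed = sequence_loss(xs[::-1], ys[::-1], Wx, Wh, Wo)
print(loss_in_order, loss_reversed)
```

Feeding the same word and label pairs in reverse order changes every hidden state, so the two printed losses will generally differ. This is the practical reason the words of a sentence have to be presented in their original order during training.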