Notes on LSTM Neural Networks

Hz、辉

Published 2023-09-23 10:54:54

Tags: neural networks, lstm, artificial intelligence

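Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.youkuaiyun.com/weixin_54720578/article/details/133200411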

Understanding GRU, LSTM, and ConvLSTM through their equations

1. GRU (equations + hand-written code)

1.1 GRU diagram

1.2 GRU equations (with the dimensions of each variable)

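Note: in the equations below, `*` is the element-wise (Hadamard) product of two vectors, and `·` is the matrix-vector product.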
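To make the two operators concrete, here is a minimal sketch in plain Python. The helper names `matvec` and `hadamard` are made up for this illustration and do not come from the original post; column vectors of shape (d, 1) are stored as flat lists of length d.

```python
# Hypothetical helpers, for illustration only.
def matvec(W, x):
    """`·` in the equations: matrix-vector product, (h, d) · (d,) -> (h,)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def hadamard(a, b):
    """`*` in the equations: element-wise product of two equal-length vectors."""
    return [a_i * b_i for a_i, b_i in zip(a, b)]


W = [[1.0, 2.0, 3.0],
     [0.0, 1.0, 0.0]]   # shape (hiddendim=2, inputdim=3)
x = [1.0, 0.0, 2.0]     # shape (inputdim=3,)
r = [0.5, 0.0]          # a gate vector, shape (hiddendim=2,)
h = [4.0, 6.0]          # a hidden state, shape (hiddendim=2,)

wx = matvec(W, x)       # [7.0, 0.0]
rh = hadamard(r, h)     # [2.0, 0.0]
print(wx, rh)
```

The matrix-vector product changes the dimension from inputdim to hiddendim, while the element-wise product keeps it, which is why gates such as rt and zt must have the same shape as Ht-1.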

(1) Reset gate:

Xt: input vector, (inputdim, 1)

Ht-1: output vector of the previous GRU step, (hiddendim, 1)

Wxr: (hiddendim, inputdim)

Whr: (hiddendim, hiddendim)

br: (hiddendim, 1)

rt: (hiddendim, 1)
$r_t = \mathrm{sigmoid}(W_{xr} \cdot X_t + W_{hr} \cdot H_{t-1} + b_r)$
(2) Update gate:

Wxz: (hiddendim, inputdim)

Whz: (hiddendim, hiddendim)

bz: (hiddendim, 1)

zt: (hiddendim, 1)
$z_t = \mathrm{sigmoid}(W_{xz} \cdot X_t + W_{hz} \cdot H_{t-1} + b_z)$

(3) Candidate hidden state (hat):

Wxh: (hiddendim, inputdim)

Whh: (hiddendim, hiddendim)

bn: (hiddendim, 1)

hat: (hiddendim, 1)
$\hat{h}_t = \tanh(W_{xh} \cdot X_t + W_{hh} \cdot (r_t * H_{t-1}) + b_n)$
(4) Output:

Ht: (hiddendim, 1)
$H_t = z_t * H_{t-1} + (1 - z_t) * \hat{h}_t$

1.3 GRU by hand

https://gitee.com/Hz092811/neural-network/blob/master/%E6%89%8B%E5%86%99GRU.ipynb
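For readers who cannot open the notebook, the four equations of section 1.2 can be sketched in plain Python as follows. This is an independent illustration under stated assumptions, not the code from the linked notebook: `gru_step`, `init_params`, the parameter dictionary layout, and the small random initialisation are all choices made here, and column vectors are stored as flat lists.

```python
import math
import random


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def tanh(v):
    return [math.tanh(a) for a in v]


def matvec(W, x):
    """Matrix-vector product, the `·` of the equations."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def add(*vs):
    """Element-wise sum of any number of equal-length vectors."""
    return [sum(t) for t in zip(*vs)]


def mul(a, b):
    """Element-wise product, the `*` of the equations."""
    return [p * q for p, q in zip(a, b)]


def gru_step(x_t, h_prev, p):
    """One GRU time step; p maps the names of section 1.2 to weights."""
    r_t = sigmoid(add(matvec(p["Wxr"], x_t), matvec(p["Whr"], h_prev), p["br"]))
    z_t = sigmoid(add(matvec(p["Wxz"], x_t), matvec(p["Whz"], h_prev), p["bz"]))
    # The reset gate scales H_{t-1} before it enters the candidate state.
    h_hat = tanh(add(matvec(p["Wxh"], x_t),
                     matvec(p["Whh"], mul(r_t, h_prev)), p["bn"]))
    one_minus_z = [1.0 - z for z in z_t]
    return add(mul(z_t, h_prev), mul(one_minus_z, h_hat))


def init_params(inputdim, hiddendim, seed=0):
    """Small random weights and zero biases (an arbitrary choice for the demo)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in ("r", "z", "h"):
        p["Wx" + g] = mat(hiddendim, inputdim)
        p["Wh" + g] = mat(hiddendim, hiddendim)
    for b in ("br", "bz", "bn"):
        p[b] = [0.0] * hiddendim
    return p


inputdim, hiddendim = 3, 4
params = init_params(inputdim, hiddendim)
h = [0.0] * hiddendim                      # H_0
for x_t in ([1.0, 0.0, -1.0], [0.5, 0.5, 0.5]):
    h = gru_step(x_t, h, params)           # H_t has length hiddendim
print(h)
```

Because Ht is a convex combination of Ht-1 and a tanh output, every component stays strictly inside (-1, 1) as long as the initial state does.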

2. LSTM (equations + hand-written code)

2.1 LSTM diagram

2.2 LSTM equations

(1) Forget gate:

Xt: input vector, (inputdim, 1)

Ht-1: output vector of the previous LSTM step, (hiddendim, 1)

Wf: (hiddendim, inputdim + hiddendim)

bf: (hiddendim, 1)
$f_t = \mathrm{sigmoid}(W_f \cdot [X_t, H_{t-1}] + b_f)$
(2) Input gate (update gate):

Xt: input vector, (inputdim, 1)

Ht-1: output vector of the previous LSTM step, (hiddendim, 1)

Wi: (hiddendim, inputdim + hiddendim)

bi: (hiddendim, 1)
$i_t = \mathrm{sigmoid}(W_i \cdot [X_t, H_{t-1}] + b_i)$
(3) Output gate:

Xt: input vector, (inputdim, 1)

Ht-1: output vector of the previous LSTM step, (hiddendim, 1)

Wo: (hiddendim, inputdim + hiddendim)

bo: (hiddendim, 1)
$o_t = \mathrm{sigmoid}(W_o \cdot [X_t, H_{t-1}] + b_o)$
(4) Cell state:

ft: (hiddendim, 1)

ct-1: (hiddendim, 1)

it: (hiddendim, 1)

ct: (hiddendim, 1)
$c_t = f_t * c_{t-1} + i_t * \tanh(W_c \cdot [X_t, H_{t-1}] + b_c)$
(5) Output:

Ht: (hiddendim, 1)

ot: (hiddendim, 1)

ct: (hiddendim, 1)
$H_t = o_t * \tanh(c_t)$
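The five LSTM equations above can be sketched the same way. This is an illustration under assumptions rather than a reference implementation: `lstm_step` and the parameter dictionary are names invented here, Wc and bc are assumed to have the same shapes as the other gate parameters, and [Xt, Ht-1] is implemented as list concatenation, which is equivalent to using separate input and hidden weight matrices as in the GRU section.

```python
import math
import random


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def tanh(v):
    return [math.tanh(a) for a in v]


def matvec(W, x):
    """Matrix-vector product, the `·` of the equations."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def add(a, b):
    return [p + q for p, q in zip(a, b)]


def mul(a, b):
    """Element-wise product, the `*` of the equations."""
    return [p * q for p, q in zip(a, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; returns (H_t, c_t)."""
    xh = x_t + h_prev  # [X_t, H_{t-1}], length inputdim + hiddendim
    f_t = sigmoid(add(matvec(p["Wf"], xh), p["bf"]))   # forget gate
    i_t = sigmoid(add(matvec(p["Wi"], xh), p["bi"]))   # input gate
    o_t = sigmoid(add(matvec(p["Wo"], xh), p["bo"]))   # output gate
    c_tilde = tanh(add(matvec(p["Wc"], xh), p["bc"]))  # candidate cell state
    c_t = add(mul(f_t, c_prev), mul(i_t, c_tilde))
    h_t = mul(o_t, tanh(c_t))
    return h_t, c_t


inputdim, hiddendim = 3, 4
rng = random.Random(0)
params = {}
for g in ("f", "i", "o", "c"):
    # Each W has shape (hiddendim, inputdim + hiddendim); biases start at zero.
    params["W" + g] = [[rng.uniform(-0.1, 0.1) for _ in range(inputdim + hiddendim)]
                       for _ in range(hiddendim)]
    params["b" + g] = [0.0] * hiddendim

h = [0.0] * hiddendim
c = [0.0] * hiddendim
for x_t in ([1.0, 0.0, -1.0], [0.5, 0.5, 0.5]):
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

Unlike the GRU, the LSTM carries two vectors between steps: the cell state ct, which is only rescaled and added to, and the output Ht, which is a gated, squashed view of ct.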

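3. ConvLSTM (equations)

Key point: ConvLSTM replaces the matrix-vector multiplications of the LSTM with convolutions. Its input is a 3D tensor, and convolution is used to extract spatial features.

`**` denotes the convolution operation, and W holds the convolution kernel parameters.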

(1) Forget gate:

Xt: input tensor, (n, m, inputdim)

Ht-1: (n, m, hiddendim)

Wf: convolution kernel

bf: (n, m, hiddendim)

ft: (n, m, hiddendim)
$f_t = \mathrm{sigmoid}(W_f ** [X_t, H_{t-1}] + b_f)$
(2) Input gate (update gate):

Xt: input tensor, (n, m, inputdim)

Ht-1: (n, m, hiddendim)

Wi: convolution kernel

bi: (n, m, hiddendim)

it: (n, m, hiddendim)
$i_t = \mathrm{sigmoid}(W_i ** [X_t, H_{t-1}] + b_i)$
(3) Output gate:

Xt: input tensor, (n, m, inputdim)

Ht-1: (n, m, hiddendim)

Wo: convolution kernel

bo: (n, m, hiddendim)

ot: (n, m, hiddendim)
$o_t = \mathrm{sigmoid}(W_o ** [X_t, H_{t-1}] + b_o)$
(4) Cell state:

ft: (n, m, hiddendim)

ct-1: (n, m, hiddendim)

it: (n, m, hiddendim)

ct: (n, m, hiddendim)
$c_t = f_t * c_{t-1} + i_t * \tanh(W_c ** [X_t, H_{t-1}] + b_c)$
(5) Output:

Ht: (n, m, hiddendim)

ot: (n, m, hiddendim)

ct: (n, m, hiddendim)
$H_t = o_t * \tanh(c_t)$
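To show how the convolution slots into the LSTM update, here is a deliberately naive plain-Python sketch of one ConvLSTM step. Several things in it are assumptions made for this illustration and are not stated in the post: the names `conv2d_same`, `emap`, and `convlstm_step`, the kernel layout (k, k, in_channels, out_channels), zero padding so that the n x m size is preserved, a per-channel bias that is broadcast over all positions, and computing the convolution as cross-correlation (no kernel flip), as deep learning libraries commonly do.

```python
import math
import random


def conv2d_same(x, W, b):
    """Zero-padded 'same' convolution.
    x: n x m x c_in, W: k x k x c_in x c_out (k odd), b: c_out -> n x m x c_out."""
    n, m, k = len(x), len(x[0]), len(W)
    c_in, c_out, pad = len(W[0][0]), len(b), len(W) // 2
    out = [[list(b) for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for di in range(k):
                for dj in range(k):
                    y, z = i + di - pad, j + dj - pad
                    if 0 <= y < n and 0 <= z < m:  # outside the map counts as zero
                        for ci in range(c_in):
                            v = x[y][z][ci]
                            for co in range(c_out):
                                out[i][j][co] += W[di][dj][ci][co] * v
    return out


def emap(fn, *ts):
    """Apply fn element-wise over equally shaped n x m x c tensors."""
    return [[[fn(*vals) for vals in zip(*cells)] for cells in zip(*rows)]
            for rows in zip(*ts)]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def convlstm_step(x_t, h_prev, c_prev, p):
    """One ConvLSTM time step; returns (H_t, c_t), both n x m x hiddendim."""
    # [X_t, H_{t-1}]: concatenate along the channel axis.
    xh = [[xc + hc for xc, hc in zip(xr, hr)] for xr, hr in zip(x_t, h_prev)]
    f_t = emap(sigmoid, conv2d_same(xh, p["Wf"], p["bf"]))
    i_t = emap(sigmoid, conv2d_same(xh, p["Wi"], p["bi"]))
    o_t = emap(sigmoid, conv2d_same(xh, p["Wo"], p["bo"]))
    g_t = emap(math.tanh, conv2d_same(xh, p["Wc"], p["bc"]))
    c_t = emap(lambda f, c, i, g: f * c + i * g, f_t, c_prev, i_t, g_t)
    h_t = emap(lambda o, c: o * math.tanh(c), o_t, c_t)
    return h_t, c_t


n, m, inputdim, hiddendim, k = 4, 5, 2, 3, 3
rng = random.Random(0)
params = {}
for g in "fioc":
    params["W" + g] = [[[[rng.uniform(-0.1, 0.1) for _ in range(hiddendim)]
                         for _ in range(inputdim + hiddendim)]
                        for _ in range(k)] for _ in range(k)]
    params["b" + g] = [0.0] * hiddendim

x = [[[1.0] * inputdim for _ in range(m)] for _ in range(n)]
h = [[[0.0] * hiddendim for _ in range(m)] for _ in range(n)]
c = [[[0.0] * hiddendim for _ in range(m)] for _ in range(n)]
h, c = convlstm_step(x, h, c, params)
print(len(h), len(h[0]), len(h[0][0]))
```

The gate logic is identical to the LSTM; only the linear map changes. Each output position now depends on a k x k neighbourhood of the input and of the previous hidden state, which is how the spatial structure is kept.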
