Abstract
LSTM is an excellent variant of the RNN model and inherits most of the characteristics of RNNs. In this study, the manual derivation of LSTM is demonstrated, its computation is simulated line by line in code, and the result is verified for consistency against the output of the PyTorch API.
1 LSTM
1.1 Using the PyTorch LSTM
To instantiate the LSTM class, the following arguments can be passed:
- input_size: feature dimension of the input
- hidden_size: size of the hidden state $h_t$
- num_layers: defaults to 1; a value greater than 1 stacks that many LSTM layers
- batch_first: defaults to False; if True, inputs and outputs are shaped (batch, seq, feature); if False, they are shaped (seq, batch, feature)
- bidirectional: defaults to False; if True, the LSTM is bidirectional and the output feature dimension becomes 2*hidden_size
- proj_size: if greater than 0, an LSTM with projections of the corresponding size is used
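As a minimal sketch of instantiation (the sizes below are arbitrary examples, not values from this post):

```python
import torch.nn as nn

# Hypothetical sizes chosen for illustration only
lstm = nn.LSTM(
    input_size=4,       # feature dimension of each time step
    hidden_size=8,      # size of the hidden state h_t
    num_layers=2,       # two stacked LSTM layers
    batch_first=True,   # inputs/outputs shaped (batch, seq, feature)
    bidirectional=False,
)
```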
Forward inputs: (input, (h_0, c_0))
- input: when batch_first=True, shaped (N, L, $H_{in}$); when batch_first=False, shaped (L, N, $H_{in}$)
- h_0: defaults to zeros; shaped (D*num_layers, N, $H_{out}$)
- c_0: defaults to zeros; shaped (D*num_layers, N, $H_{cell}$)
where:
N = batch size
L = sequence length
D = 2 if bidirectional=True, otherwise 1
$H_{in}$ = input_size
$H_{cell}$ = hidden_size
$H_{out}$ = proj_size if proj_size > 0, otherwise hidden_size
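Following these shape conventions, the forward inputs can be constructed like this (all sizes are illustrative; here proj_size=0, so $H_{cell}$ equals hidden_size):

```python
import torch

# Illustrative sizes: N = batch, L = sequence length, H_in = input_size
N, L, H_in, H_out, num_layers = 3, 5, 4, 8, 1
D = 1  # unidirectional

x   = torch.randn(N, L, H_in)                 # input, batch_first=True layout
h_0 = torch.zeros(D * num_layers, N, H_out)   # initial hidden state (zeros if omitted)
c_0 = torch.zeros(D * num_layers, N, H_out)   # initial cell state; H_cell == hidden_size here
```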
Forward outputs: (output, (h_n, c_n))
- output: when batch_first=True, shaped (N, L, D*$H_{out}$); when batch_first=False, shaped (L, N, D*$H_{out}$)
- h_n: shaped (D*num_layers, N, $H_{out}$)
- c_n: shaped (D*num_layers, N, $H_{cell}$)
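Putting the input and output conventions together, a forward pass and its shapes can be checked as follows (sizes are arbitrary examples; the bidirectional case doubles D, and with proj_size=0 we have $H_{cell}$ == hidden_size):

```python
import torch
import torch.nn as nn

N, L, H_in, H_out, num_layers = 3, 5, 4, 8, 2

for bidirectional in (False, True):
    D = 2 if bidirectional else 1
    lstm = nn.LSTM(H_in, H_out, num_layers=num_layers,
                   batch_first=True, bidirectional=bidirectional)
    x = torch.randn(N, L, H_in)
    output, (h_n, c_n) = lstm(x)  # h_0 and c_0 default to zeros

    assert output.shape == (N, L, D * H_out)          # (N, L, D*H_out)
    assert h_n.shape == (D * num_layers, N, H_out)    # (D*num_layers, N, H_out)
    assert c_n.shape == (D * num_layers, N, H_out)    # H_cell == hidden_size here
```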