Abstract
LSTM is an excellent variant of the RNN model and inherits most of the characteristics of RNNs. In this study, the manual derivation of LSTM is demonstrated, its computation is simulated line by line in code, and the result is verified for consistency against the output of the PyTorch API.
1 LSTM
1.1 Using the PyTorch LSTM
To instantiate the LSTM class, the following arguments can be passed:
- input_size: feature dimension of the input
- hidden_size: size of the hidden state $h_t$
- num_layers: defaults to 1; a value greater than 1 stacks that many LSTM layers
- batch_first: defaults to False; if True, inputs and outputs are shaped (batch, seq, feature); if False, they are shaped (seq, batch, feature)
- bidirectional: defaults to False; if True, the LSTM is bidirectional and the output feature dimension becomes 2*hidden_size
- proj_size: if greater than 0, an LSTM with projections of the corresponding size is used
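As a minimal sketch of instantiation (the sizes below are arbitrary examples, not values from this post):

```python
import torch.nn as nn

# Hypothetical sizes chosen for illustration only
lstm = nn.LSTM(
    input_size=4,       # feature dimension of each time step
    hidden_size=8,      # size of the hidden state h_t
    num_layers=2,       # two stacked LSTM layers
    batch_first=True,   # inputs/outputs shaped (batch, seq, feature)
    bidirectional=False,
)
```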
Forward inputs: (input, (h_0, c_0))
- input: when batch_first=True, shaped (N, L, $H_{in}$); when batch_first=False, shaped (L, N, $H_{in}$)
- h_0: defaults to zeros; shaped (D*num_layers, N, $H_{out}$)
- c_0: defaults to zeros; shaped (D*num_layers, N, $H_{cell}$)
where:
N = batch size
L = sequence length
D = 2 if bidirectional=True, otherwise 1
$H_{in}$ = input_size
$H_{cell}$ = hidden_size
$H_{out}$ = proj_size if proj_size > 0, otherwise hidden_size
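Following these shape conventions, the forward inputs can be constructed like this (all sizes are illustrative; here proj_size=0, so $H_{cell}$ equals hidden_size):

```python
import torch

# Illustrative sizes: N = batch, L = sequence length, H_in = input_size
N, L, H_in, H_out, num_layers = 3, 5, 4, 8, 1
D = 1  # unidirectional

x   = torch.randn(N, L, H_in)                 # input, batch_first=True layout
h_0 = torch.zeros(D * num_layers, N, H_out)   # initial hidden state (zeros if omitted)
c_0 = torch.zeros(D * num_layers, N, H_out)   # initial cell state; H_cell == hidden_size here
```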
Forward outputs: (output, (h_n, c_n))
- output: when batch_first=True, shaped (N, L, D*$H_{out}$); when batch_first=False, shaped (L, N, D*$H_{out}$)
- h_n: shaped (D*num_layers, N, $H_{out}$)
- c_n: shaped (D*num_layers, N, $H_{cell}$)
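Putting the input and output conventions together, a forward pass and its shapes can be checked as follows (sizes are arbitrary examples; the bidirectional case doubles D, and with proj_size=0 we have $H_{cell}$ == hidden_size):

```python
import torch
import torch.nn as nn

N, L, H_in, H_out, num_layers = 3, 5, 4, 8, 2

for bidirectional in (False, True):
    D = 2 if bidirectional else 1
    lstm = nn.LSTM(H_in, H_out, num_layers=num_layers,
                   batch_first=True, bidirectional=bidirectional)
    x = torch.randn(N, L, H_in)
    output, (h_n, c_n) = lstm(x)  # h_0 and c_0 default to zeros

    assert output.shape == (N, L, D * H_out)          # (N, L, D*H_out)
    assert h_n.shape == (D * num_layers, N, H_out)    # (D*num_layers, N, H_out)
    assert c_n.shape == (D * num_layers, N, H_out)    # H_cell == hidden_size here
```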