RNNCell
nn.RNNCell(input_size, hidden_size, bias=True, nonlinearity='tanh')
$$h^{\prime}=\tanh \left(W_{i h} x+b_{i h}+W_{h h} h+b_{h h}\right)$$
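- input_size: number of features in the input x.
- hidden_size: number of units in the hidden layer, i.e. the number of features in the hidden state.
- bias: defaults to True; if False, the cell does not use the bias terms b_ih and b_hh.
- nonlinearity: defaults to 'tanh'; 'relu' is also available.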
Inputs:
- input: [batch, input_size]
- hidden: [batch, hidden_size]
Outputs:
- $h^{\prime}$: [batch, hidden_size]
Parameters:
- RNNCell.weight_ih: [hidden_size, input_size]
- RNNCell.weight_hh: [hidden_size, hidden_size]
- RNNCell.bias_ih: [hidden_size]
- RNNCell.bias_hh: [hidden_size]
import torch
# Input feature size 5, hidden size 10
rnn_cell = torch.nn.RNNCell(5, 10)
# batch_size = 2
input = torch.randn(2,5)
h_0 = torch.randn(2,10)
h = rnn_cell(input,h_0)
h.shape
>>torch.Size([2, 10])
[(para[0],para[1].shape) for para in list(rnn_cell.named_parameters())]
>>[('weight_ih', torch.Size([10, 5])),
('weight_hh', torch.Size([10, 10])),
('bias_ih', torch.Size([10])),
('bias_hh', torch.Size([10]))]
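To connect the formula with the parameter shapes listed above, the update can be recomputed by hand from the cell's own weights. This is a minimal sketch rather than how PyTorch implements the cell internally; the names `manual`, `builtin`, and `matches` are introduced here for illustration only.

```python
import torch

torch.manual_seed(0)
rnn_cell = torch.nn.RNNCell(5, 10)
x = torch.randn(2, 5)       # [batch, input_size]
h_0 = torch.randn(2, 10)    # [batch, hidden_size]

# Recompute h' = tanh(W_ih x + b_ih + W_hh h + b_hh) by hand.
# Weights are stored as [hidden_size, in_features], hence the transposes
# when multiplying batch-major inputs from the left.
manual = torch.tanh(
    x @ rnn_cell.weight_ih.T + rnn_cell.bias_ih
    + h_0 @ rnn_cell.weight_hh.T + rnn_cell.bias_hh
)

builtin = rnn_cell(x, h_0)
# Compare with a small tolerance, since the order of float operations may differ.
matches = torch.allclose(manual, builtin, atol=1e-5)
print(manual.shape, matches)
```

The transposes are the only subtle point: each weight maps a feature vector to a hidden vector, so a batch of row vectors is multiplied by the transposed weight.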
RNN
torch.nn.RNN(*args, **kwargs)
$$h_{t}=\tanh \left(W_{i h} x_{t}+b_{i h}+W_{h h} h_{(t-1)}+b_{h h}\right)$$
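- input_size: number of features in the input x.
- hidden_size: number of units in the hidden layer, i.e. the number of features in the hidden state.
- num_layers: number of stacked recurrent layers; defaults to 1.
- nonlinearity: defaults to 'tanh'; 'relu' is also available.
- bias: defaults to True; if False, the layers do not use the bias terms b_ih and b_hh.
- batch_first: if True, the first dimension of the input and output tensors is the batch. Defaults to False, in which case the first dimension is the sequence length, the second is the batch, and the third is the feature size.
- dropout: if non-zero, adds a Dropout layer on the outputs of each RNN layer except the last, with the dropout probability given by this value. Defaults to 0.
- bidirectional: if True, becomes a bidirectional RNN. Defaults to False.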
Inputs:
- input: [seq_len, batch, input_size]
- $h_{0}$: [num_layers * num_directions, batch, hidden_size]
Outputs:
- out: [seq_len, batch, num_directions * hidden_size]
- $h_{n}$: [num_layers * num_directions, batch, hidden_size]
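The shapes above depend on `batch_first` and `bidirectional`, so a quick check may help. The sketch below uses illustrative sizes (the variable names are introduced here, not taken from the original) and relies on `h_0` defaulting to zeros when it is omitted.

```python
import torch

seq_len, batch, input_size, hidden_size, num_layers = 4, 2, 5, 10, 2

# batch_first=True moves the batch dimension to the front of input and out;
# the hidden state h_n keeps its [layers * directions, batch, hidden] layout.
rnn_bf = torch.nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
x_bf = torch.randn(batch, seq_len, input_size)
out_bf, hn_bf = rnn_bf(x_bf)  # h_0 omitted, so it defaults to zeros
print(out_bf.shape, hn_bf.shape)

# bidirectional=True gives num_directions = 2, which doubles the last
# dimension of out and the first dimension of h_0 / h_n.
rnn_bi = torch.nn.RNN(input_size, hidden_size, num_layers, bidirectional=True)
x = torch.randn(seq_len, batch, input_size)
h_0 = torch.randn(num_layers * 2, batch, hidden_size)
out_bi, hn_bi = rnn_bi(x, h_0)
print(out_bi.shape, hn_bi.shape)
```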
Parameters:
- RNN.weight_ih_l[k]: [hidden_size, input_size] for layer 0; [hidden_size, num_directions * hidden_size] for later layers
- RNN.weight_hh_l[k]: [hidden_size, hidden_size]
- RNN.bias_ih_l[k]: [hidden_size]
- RNN.bias_hh_l[k]: [hidden_size]
# Input feature size 5, hidden size 10, 2 layers
rnn = torch.nn.RNN(5, 10, 2)
# Sequence length 4, batch_size = 2
input = torch.randn(4, 2, 5)
h_0 = torch.randn(2, 2, 10)
output, hn = rnn(input, h_0)
print(output.size(),hn.size())
>>torch.Size([4, 2, 10]) torch.Size([2, 2, 10])
[(para[0],para[1].shape) for para in list(rnn.named_parameters())]
>>[('weight_ih_l0', torch.Size([10, 5])),
('weight_hh_l0', torch.Size([10, 10])),
('bias_ih_l0', torch.Size([10])),
('bias_hh_l0', torch.Size([10])),
('weight_ih_l1', torch.Size([10, 10])),
('weight_hh_l1', torch.Size([10, 10])),
('bias_ih_l1', torch.Size([10])),
('bias_hh_l1', torch.Size([10]))]
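To see how the recurrence for $h_t$ relates to `output` and `hn`, a single-layer RNN can be unrolled by hand. This is a sketch under simplifying assumptions (one layer, one direction, tanh nonlinearity); `manual`, `same_output`, and `same_last` are names introduced here for illustration.

```python
import torch

torch.manual_seed(0)
rnn = torch.nn.RNN(5, 10, 1)       # single layer, unidirectional
x = torch.randn(4, 2, 5)           # [seq_len, batch, input_size]
h_0 = torch.randn(1, 2, 10)        # [num_layers, batch, hidden_size]
output, hn = rnn(x, h_0)

# Apply h_t = tanh(W_ih x_t + b_ih + W_hh h_(t-1) + b_hh) one step at a time.
h = h_0[0]
steps = []
for x_t in x:                      # iterates over the seq_len dimension
    h = torch.tanh(
        x_t @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
        + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
    )
    steps.append(h)
manual = torch.stack(steps)        # [seq_len, batch, hidden_size]

# output collects the hidden state at every step; hn is the final one.
same_output = torch.allclose(manual, output, atol=1e-5)
same_last = torch.allclose(output[-1], hn[-1], atol=1e-6)
print(same_output, same_last)
```

In the unidirectional case, `output` holds the last layer's hidden state at each time step, so its final slice coincides with `hn[-1]`.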
rnn = torch.nn.RNN(5, 10, 2, bidirectional=True)
[(para[0],para[1].shape) for para in list(rnn.named_parameters())]
>>[('weight_ih_l0', torch.Size([10, 5])),
('weight_hh_l0', torch.Size([10, 10])),
('bias_ih_l0', torch.Size([10])),
('bias_hh_l0', torch.Size([10])),
('weight_ih_l0_reverse', torch.Size([10, 5])),
('weight_hh_l0_reverse', torch.Size([10, 10])),
('bias_ih_l0_reverse', torch.Size([10])),
('bias_hh_l0_reverse', torch.Size([10])),
('weight_ih_l1', torch.Size([10, 20])),
('weight_hh_l1', torch.Size([10, 10])),
('bias_ih_l1', torch.Size([10])),
('bias_hh_l1', torch.Size([10])),
('weight_ih_l1_reverse',