In TensorFlow 2, the Keras `layers` module packages the common recurrent network layers, including classes such as keras.layers.SimpleRNN, keras.layers.SimpleRNNCell, keras.layers.LSTM, and keras.layers.LSTMCell. The keras.layers.SimpleRNN, keras.layers.LSTM, and keras.layers.GRU classes are the TensorFlow 2 counterparts of the familiar RNN, LSTM, and GRU models. The inputs and outputs of these recurrent layers are briefly introduced below.
The output a recurrent network should return depends on the task: sometimes only the last time step's output is needed, and sometimes the outputs of all time steps are used.
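The `*Cell` classes mentioned above process a single time step; to run one over a whole sequence it is wrapped in `tf.keras.layers.RNN`. A minimal sketch (the layer size 4 and input shape are arbitrary examples):

```python
import numpy as np
import tensorflow as tf

# SimpleRNNCell computes one time step; tf.keras.layers.RNN unrolls it
# over the time axis, giving the same behavior as keras.layers.SimpleRNN.
cell = tf.keras.layers.SimpleRNNCell(4)
rnn = tf.keras.layers.RNN(cell)

inputs = np.random.random([32, 10, 8]).astype(np.float32)  # [batch, time, features]
output = rnn(inputs)
print(output.shape)  # (32, 4)
```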
RNN

Output at the last time step only
Input shape: [batch_size, time_step, input_features]
Output shape: [batch_size, time_step, units] (with return_sequences=True) or [batch_size, units]
import numpy as np
import tensorflow as tf

inputs = np.random.random([32, 10, 8]).astype(np.float32)
simple_rnn = tf.keras.layers.SimpleRNN(4)
output = simple_rnn(inputs)
Output shape
The output has shape [32, 4].
Outputs at every time step
simple_rnn = tf.keras.layers.SimpleRNN(
    4, return_sequences=True, return_state=True)
whole_sequence_output, final_state = simple_rnn(inputs)
Output shapes
whole_sequence_output has shape [32, 10, 4].
final_state has shape [32, 4].
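For SimpleRNN the hidden state *is* the per-step output, so the returned final_state equals the last time step of the full sequence output. A quick check (same shapes as above):

```python
import numpy as np
import tensorflow as tf

inputs = np.random.random([32, 10, 8]).astype(np.float32)
simple_rnn = tf.keras.layers.SimpleRNN(4, return_sequences=True, return_state=True)
whole_sequence_output, final_state = simple_rnn(inputs)

# The final state equals the output at the last time step.
assert np.allclose(whole_sequence_output[:, -1, :], final_state)
```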
LSTM
The LSTM maintains an internal cell state c_t; passing it through the LSTM's output gate finally yields the hidden state h_t.
Output at the last time step only
inputs = tf.random.normal([32, 10, 8])
lstm = tf.keras.layers.LSTM(4)
output = lstm(inputs)
print(output.shape)
Output shape
The output has shape [32, 4].
Getting every time step's output and the final states
lstm = tf.keras.layers.LSTM(4, return_sequences=True, return_state=True)
whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
print(whole_seq_output.shape)
print(final_memory_state.shape)
print(final_carry_state.shape)
Output shapes
whole_seq_output has shape [32, 10, 4].
final_memory_state has shape [32, 4].
final_carry_state has shape [32, 4].
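Tying this back to the states above: final_memory_state is the hidden state h_t, so it equals the last time step of whole_seq_output, while final_carry_state is the internal cell state c_t and is generally different. A quick check:

```python
import numpy as np
import tensorflow as tf

inputs = tf.random.normal([32, 10, 8])
lstm = tf.keras.layers.LSTM(4, return_sequences=True, return_state=True)
whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)

# final_memory_state is h_t: identical to the last step of the sequence output.
# final_carry_state is c_t: the cell state before the output gate is applied.
assert np.allclose(whole_seq_output[:, -1, :], final_memory_state)
```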
GRU

Output at the last time step only
inputs = tf.random.normal([32, 10, 8])
gru = tf.keras.layers.GRU(4)
output = gru(inputs)
print(output.shape)
Output shape
The output has shape [32, 4].
Outputs at every time step
gru = tf.keras.layers.GRU(4, return_sequences=True, return_state=True)
whole_sequence_output, final_state = gru(inputs)
print(whole_sequence_output.shape)
print(final_state.shape)
Output shapes
whole_sequence_output has shape [32, 10, 4].
final_state has shape [32, 4].
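One common use of the all-time-steps output: when recurrent layers are stacked, every layer except the last needs return_sequences=True so the next layer still receives a 3-D [batch, time, units] input. A sketch with two GRU layers (unit sizes are arbitrary examples):

```python
import tensorflow as tf

# Stacked GRUs: the first layer emits all time steps so the second layer
# receives a [batch, time, 16] sequence; the last layer returns only the
# final time step.
model = tf.keras.Sequential([
    tf.keras.layers.GRU(16, return_sequences=True),
    tf.keras.layers.GRU(4),
])

inputs = tf.random.normal([32, 10, 8])
output = model(inputs)
print(output.shape)  # (32, 4)
```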