LSTM,long short term memory,长短时记忆元
与GRU相比,它是3个门,GRU是2个门。GRU更简单,易于大规模应用;LSTM准确性比GRU更好。到底用哪个,需要权衡。
基于Tensorflow的python代码片段:
def unit(x, hidden_memory_tm1):
previous_hidden_state, c_prev = tf.unstack(hidden_memory_tm1)
# Input Gate
i = tf.sigmoid(
tf.matmul(x, self.Wi) +
tf.matmul(previous_hidden_state, self.Ui) + self.bi
)
# Forget Gate
f = tf.sigmoid(
tf.matmul(x, self.Wf) +
tf.matmul(previous_hidden_state, self.Uf) + self.bf
)
# Output Gate
o = tf.sigmoid(
tf.matmul(x, self.Wog) +
tf.matmul(previous_hidden_state, self.Uog) + self.bog
)
# New Memory Cell
c_ = tf.nn.tanh(
tf.matmul(x, self.Wc) +
tf.matmul(previous_hidden_state, self.Uc) + self.bc
)
# Final Memory cell
c = f * c_prev + i * c_
# Current Hidden state
current_hidden_state = o * tf.nn.tanh(c)
return tf.stack([current_hidden_state, c])