INFO: Sutskever2014_Sequence to Sequence Learning with Neural Networks
ABSTRACT
- Use one LSTM to read the input sequence, one timestep at a time, to obtain a large fixed-dimensional vector representation, and then use another LSTM to extract the output sequence from that vector. The second LSTM is essentially a recurrent neural network language model, except that it is conditioned on the input sequence (a minimal code sketch follows this list).
- It is not clear how to apply an RNN to problems whose input and output sequences have different lengths with complicated, non-monotonic relationships.
- Reversing the input sentences results in LSTMs with better memory utilization.
(Instead of mapping the sentence a, b, c to the sentence α, β, γ, the LSTM is asked to map c, b, a to α, β, γ, where α, β, γ is the translation of a, b, c.)
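A minimal sketch of this encoder-decoder setup with input reversal, assuming PyTorch; the class name, vocabulary sizes, and single-layer dimensions are illustrative placeholders, not the paper's deep 4-layer, 1000-unit configuration:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=32, hid_dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Encoder LSTM: reads the (reversed) source sequence one timestep at a time
        # and summarizes it in its final hidden/cell state.
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Decoder LSTM: an RNN language model over the target sequence,
        # conditioned on the source via the encoder's final (h, c) state.
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt_in):
        # Reverse the source: map "c, b, a" instead of "a, b, c".
        src_rev = torch.flip(src, dims=[1])
        _, state = self.encoder(self.src_emb(src_rev))  # fixed-dimensional summary
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), state)
        return self.out(dec_out)  # logits over the target vocabulary

# Toy usage with hypothetical sizes: batch of 2, source length 5, target length 6.
model = Seq2Seq(src_vocab=100, tgt_vocab=120)
src = torch.randint(0, 100, (2, 5))
tgt_in = torch.randint(0, 120, (2, 6))
logits = model(src, tgt_in)
print(logits.shape)  # torch.Size([2, 6, 120])
```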
RELEVANT INFORMATION:
- Encoder-Decoder: