From Quora.
1. RNNs do not make the Markov assumption, so in theory they can take long-term dependencies into account when modeling natural language.
But training an RNN also runs into the vanishing gradient problem. How is that addressed? By using an LSTM?
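Yes: gated architectures such as the LSTM (and GRU) are the standard remedy, since the additive cell state gives gradients a path that is not repeatedly squashed by the recurrence; gradient clipping is the usual companion fix for the exploding case. Below is a minimal probe of the effect, assuming PyTorch is available. The hidden size and sequence length here are arbitrary illustrative choices, and the exact magnitudes depend on initialization, but the LSTM typically lets noticeably more gradient reach the early time steps.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, dim = 100, 1, 32  # illustrative sizes, not tuned

for name, cell in [("vanilla RNN", nn.RNN(dim, dim)),
                   ("LSTM", nn.LSTM(dim, dim))]:
    x = torch.randn(seq_len, batch, dim, requires_grad=True)
    out, _ = cell(x)          # out: (seq_len, batch, dim)
    out[-1].sum().backward()  # loss depends only on the last time step
    # Gradient norm at t=0: a rough proxy for how far credit flows back.
    print(f"{name:11s} grad norm at t=0: {x.grad[0].norm().item():.3e}")
```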
2. The main advantages of using a recurrent neural network over Markov chains and hidden Markov models are the greater representational power of neural networks and their ability to perform intelligent smoothing by taking syntactic and semantic features into account (see for example Turian et al.). By comparison, n-gram models have a number of parameters that explodes with the vocabulary size and with n, and they rely on simple smoothing techniques like Kneser–Ney or Good–Turing. I would add (for what it's worth) that I think of the hidden Markov model vs. recurrent neural network "battle" as being similar to the mixture model vs. feed-forward neural network "battle".
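To make the parameter-explosion point concrete, here is a back-of-the-envelope count, assuming a 10,000-word vocabulary and a 256-unit recurrent language model (both numbers are illustrative, not from the answer above):

```python
V, H = 10_000, 256  # assumed vocabulary size and hidden size

# An n-gram table has up to V**n entries; in practice it is extremely
# sparse, which is exactly why smoothing (Kneser-Ney etc.) is needed.
for n in (2, 3, 5):
    print(f"{n}-gram table upper bound: {V**n:.1e} entries")

# A simple RNN LM: input embedding, recurrent weights, output softmax.
rnn_params = V*H + (H*H + H*H + 2*H) + (H*V + V)
print(f"RNN LM parameters:        {rnn_params:.1e}")
```

The RNN's parameter count grows only linearly in V and does not grow with the context length at all, whereas the n-gram table grows exponentially in n.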