Outline:
- The general structure and assumptions of the two models: CBOW and Skip-Gram.
- The simplest parameter-update case: the ‘one-word context’ version.
a. Notation and the optimization objective
b. Updating the hidden-to-output-layer parameters
c. Updating the input-to-hidden-layer parameters
d. An intuitive interpretation
- Updating parameters, version 2: the ‘multi-word context’ version.
- Two techniques for speeding up the parameter updates
a. Huffman coding & hierarchical softmax
b. Negative sampling
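As a rough sketch of the ‘one-word context’ updates in items (b) and (c) above, the following toy example does one full-softmax SGD step. The sizes, names, and learning rate are made-up illustrations, not values from the referenced papers:

```python
import numpy as np

# Hypothetical toy sizes: V words in the vocabulary, N-dimensional embeddings.
V, N = 5, 3
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, N))      # input -> hidden weights
W_out = rng.normal(scale=0.1, size=(N, V))  # hidden -> output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_pair(W, W_out, w_in, w_target, eta=0.1):
    """One SGD step for a (input word index, target word index) pair."""
    h = W[w_in].copy()             # hidden layer = the input word's embedding row
    y = softmax(W_out.T @ h)       # predicted distribution over the vocabulary
    e = y.copy()
    e[w_target] -= 1.0             # prediction error e = y - t
    EH = W_out @ e                 # error backpropagated to the hidden layer
    W_out -= eta * np.outer(h, e)  # (b) hidden -> output update
    W[w_in] -= eta * EH            # (c) input -> hidden update (one row changes)

# One step should raise the predicted probability of the target word.
p_before = softmax(W_out.T @ W[2])[4]
train_pair(W, W_out, w_in=2, w_target=4)
p_after = softmax(W_out.T @ W[2])[4]
```

Note that only the input word’s row of `W` is updated, while the full `W_out` matrix is touched; this per-step cost of O(V) in the output layer is exactly what the two speed-up techniques above address.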
Handwritten notes (to be added)
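Negative sampling (item b above) can be sketched in the same spirit. The sizes, the fixed list of negatives, and all names here are hypothetical; a real implementation samples negatives from a noise distribution:

```python
import numpy as np

# Hypothetical toy setup: V words, N-dimensional vectors.
V, N = 10, 4
rng = np.random.default_rng(1)
W_in = rng.normal(scale=0.1, size=(V, N))   # input (center) word vectors
W_out = rng.normal(scale=0.1, size=(V, N))  # output (context) word vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pair_loss(center, context, negatives):
    """Negative-sampling loss for one (center, context) pair."""
    v = W_in[center]
    loss = -np.log(sigmoid(v @ W_out[context]))
    for n in negatives:
        loss -= np.log(sigmoid(-(v @ W_out[n])))
    return loss

def ns_step(center, context, negatives, eta=0.25):
    """One SGD step: pull the true context closer, push negatives away."""
    v = W_in[center].copy()
    grad_v = np.zeros_like(v)
    for idx, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = W_out[idx]
        g = sigmoid(v @ u) - label  # d(loss) / d(score)
        grad_v += g * u
        W_out[idx] -= eta * g * v   # update this output vector
    W_in[center] -= eta * grad_v    # update the center word vector

negs = [5, 6, 7]                    # negatives chosen arbitrarily here
loss_before = pair_loss(0, 1, negs)
ns_step(0, 1, negs)
loss_after = pair_loss(0, 1, negs)
```

Each step touches only 1 + K output vectors instead of all V, which is the whole point of the technique.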
Main Reference:
- 《word2vec Parameter Learning Explained》
- 《word2vec中的数学》 (The Mathematics in word2vec)
- 《word2vec Explained: Deriving Mikolov et al.’s Negative-Sampling Word-Embedding Method》
Other Reference:
- The three original papers by Tomas Mikolov:
- 《Efficient Estimation of Word Representations in Vector Space》
- 《Distributed Representations of Words and Phrases and their Compositionality》
- 《Distributed Representations of Sentences and Documents》
- Some translated articles:
a. https://www.cnblogs.com/peghoty/p/3857839.html
b. https://www.cnblogs.com/conan-ai/p/11354926.html
c. https://blog.youkuaiyun.com/u010555997/article/details/76598666
d. https://www.jianshu.com/p/4517181ca9c3
e. https://blog.youkuaiyun.com/lanyu_01/article/details/80097350