- Sentence vectors from word vectors
1) Bag of words: unweighted average of the word vectors (see the sketch after this list)
2) TF-IDF weighted average (also in the sketch below)
3) SIF weighted average
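
A minimal sketch of the first two schemes, assuming pretrained word vectors are available as a dict (word -> np.ndarray) and IDF scores have been precomputed on a corpus; all names here are illustrative:

```python
import numpy as np
from collections import Counter

def average_embedding(tokens, vectors):
    """Bag-of-words baseline: unweighted mean of the word vectors."""
    vecs = [vectors[t] for t in tokens if t in vectors]
    return np.mean(vecs, axis=0)

def tfidf_embedding(tokens, vectors, idf):
    """TF-IDF weighted average: weight each word vector by tf(w) * idf(w)."""
    tf = Counter(t for t in tokens if t in vectors)
    weights = {t: (c / len(tokens)) * idf.get(t, 0.0) for t, c in tf.items()}
    weighted_sum = sum(weights[t] * vectors[t] for t in weights)
    return weighted_sum / (sum(weights.values()) + 1e-12)
```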

That is, the MLE is approximately a weighted average of the vectors of the words in the sentence. Note that for more frequent words w, the weight a/(p(w) + a) is smaller, so this naturally down-weights frequent words.
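
In symbols (restating the approximation above, with $v_w$ the vector of word $w$, $p(w)$ its unigram probability, and $a$ a small smoothing constant):

$$\tilde{c}_s \;\propto\; \sum_{w \in s} \frac{a}{p(w) + a}\, v_w$$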


To estimate $c_s$, we estimate the direction $c_0$ by computing the first principal component of the $\tilde{c}_s$'s for a set of sentences. In other words, the final sentence embedding is obtained by subtracting from each $\tilde{c}_s$ its projection onto the first principal component.
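
Putting the two steps together, a minimal Python sketch of the SIF procedure; the `vectors` dict, the unigram probabilities `p`, and the constant `a` (around 1e-3 in the paper) are assumed inputs:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

def sif_embeddings(sentences, vectors, p, a=1e-3):
    """SIF: a/(p(w)+a)-weighted average, then remove the first principal component.

    sentences: list of token lists; vectors: word -> np.ndarray;
    p: word -> unigram probability; a: smoothing constant.
    """
    # Step 1: weighted average per sentence (the \tilde{c}_s vectors).
    emb = np.vstack([
        np.mean([a / (p[w] + a) * vectors[w] for w in s if w in vectors], axis=0)
        for s in sentences
    ])
    # Step 2: estimate the common direction c_0 as the first singular vector.
    svd = TruncatedSVD(n_components=1).fit(emb)
    c0 = svd.components_[0]                    # shape: (dim,)
    # Step 3: subtract each sentence's projection onto c_0.
    return emb - emb @ np.outer(c0, c0)
```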


- Sentence vectors directly
1) Encoder: run an RNN/LSTM and take the hidden vector at the end of the sequence; with a two-layer (e.g., bidirectional) encoder, concatenate the two resulting hidden vectors (see the sketch after this list).
RNNs using long short-term memory (LSTM) capture long-distance dependencies and have also been used for modeling sentences (Tai et al., 2015).
2) BERT: the output at the [CLS] position is taken as the sentence vector (see the sketch after this list).
3) Skip-thought vectors: skip-thought (Kiros et al., 2015) tries to reconstruct the surrounding sentences from the current one and treats the encoder's hidden state as the sentence representation.
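
A minimal PyTorch sketch of 1); the sizes and the random input are illustrative assumptions, and the point is that `h_n` holds the final hidden state per direction, which a bidirectional encoder concatenates:

```python
import torch
import torch.nn as nn

# Illustrative sizes; an embedding layer upstream would normally produce `x`.
lstm = nn.LSTM(input_size=128, hidden_size=256,
               bidirectional=True, batch_first=True)
x = torch.randn(4, 20, 128)                    # (batch, seq_len, input_size)
_, (h_n, _) = lstm(x)                          # h_n: (num_dirs, batch, hidden)
# Concatenate the two directions' final hidden states into one sentence vector.
sent_vec = torch.cat([h_n[0], h_n[1]], dim=-1) # (batch, 512)
```

And a sketch of 2) with Hugging Face `transformers` (the model name is just an example); the sentence vector is the last hidden state at position 0, where the [CLS] token sits:

```python
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tok("A simple example sentence.", return_tensors="pt")
outputs = model(**inputs)
# [CLS] is the first token of the sequence.
cls_vec = outputs.last_hidden_state[:, 0]      # (batch, hidden)
```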
This post covered several common ways to build sentence vectors: traditional methods that combine word vectors (simple averaging, TF-IDF weighted averaging, SIF weighted averaging), and deep-learning models that produce sentence vectors directly (RNN/LSTM encoders, BERT, and skip-thought vectors).