Contents
2.3 Meaning Through Dictionary
3.1 Word Similarity with Paths
3.4 Concept Probability Of A Node
3.5 Similarity with Information Content
4.3 Unsupervised: Lesk
1. Sentiment Analysis
Bag of words, kNN classifier. Training data:
- “This is a good movie.” → ☺
- “This is a great movie.” → ☺
- “This is a terrible film.” → ☹
- “This is a wonderful film.” → ?
Two problems:
- The model does not know that "movie" and "film" are synonyms. Since "film" appears only in negative examples, the model learns that it is a negative word.
- "wonderful" is not in the vocabulary (OOV: Out-Of-Vocabulary).
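The first problem can be reproduced with a minimal sketch. It assumes k = 1 and uses the number of shared words as the similarity, since the notes do not fix either choice. The helper names `bag_of_words`, `overlap` and `predict_1nn` are illustrative only.

```python
from collections import Counter

# Training sentences from the notes, with the emoticons written out as labels.
TRAIN = [
    ("This is a good movie.", "positive"),
    ("This is a great movie.", "positive"),
    ("This is a terrible film.", "negative"),
]

def bag_of_words(sentence):
    """Lowercase the sentence, drop full stops, and count each word."""
    return Counter(sentence.lower().replace(".", "").split())

def overlap(a, b):
    """Number of word occurrences shared by two bags (multiset intersection)."""
    return sum((a & b).values())

def predict_1nn(sentence):
    """Return the label of the training sentence that shares the most words."""
    query = bag_of_words(sentence)
    _, label = max(TRAIN, key=lambda ex: overlap(query, bag_of_words(ex[0])))
    return label

print(predict_1nn("This is a wonderful film."))  # negative
```

The test sentence shares four words with the negative example ("this", "is", "a", "film") and only three with each positive one, so the surface match on "film" decides the label, while "wonderful" matches nothing.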
Comparing words directly will not work. How can we make sure we compare word meanings instead?
Solution: add this information explicitly through a lexical database.
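As a rough illustration of the idea, the sketch below uses a tiny hand-written table in place of a real lexical database. The table `CANONICAL` and the helper `normalize` are hypothetical, and treating "wonderful" as a synonym of "great" is an assumption made only for this example.

```python
# Hypothetical stand-in for a lexical database: maps a word to a canonical synonym.
CANONICAL = {"film": "movie", "wonderful": "great"}

def normalize(sentence):
    """Lowercase, drop full stops, and replace each word by its canonical synonym."""
    words = sentence.lower().replace(".", "").split()
    return {CANONICAL.get(w, w) for w in words}

query = normalize("This is a wonderful film.")
positive = normalize("This is a great movie.")
negative = normalize("This is a terrible film.")

print(len(query & positive))  # 5
print(len(query & negative))  # 4
```

Once words are compared through the table, the test sentence overlaps more with the positive example than with the negative one, because the comparison is no longer based on surface forms alone.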
2. Lexical Database
2.1 What is a Lexical Database
The meaning of words can be described by:
Their dictionary definition
- But dictionary definitions are necessarily circular
- Only useful if meaning is already understood
Their relationships with other words
- Also circular, but better for text analysis
2.2 Definitions
A word sense describes one aspect of the meaning of a word.
If a word has multiple senses, it is polysemous.
2.3 Meaning Through Dictionary
Gloss: textual definition of a sense, given by a dictionary
Bank
- financial institution that accepts deposits and channels the money into lending activities
- sloping land (especially the slope beside a body of water)
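The relation between words, senses and glosses can be sketched as a small data structure. This is a toy sense inventory holding only the two glosses for "bank" listed above. The names `GLOSSES`, `senses` and `is_polysemous` are illustrative and not part of any real library.

```python
# Toy sense inventory: each word maps to a list of glosses, one per sense.
GLOSSES = {
    "bank": [
        "financial institution that accepts deposits and channels "
        "the money into lending activities",
        "sloping land (especially the slope beside a body of water)",
    ],
}

def senses(word):
    """Return the glosses recorded for a word (empty list if the word is unknown)."""
    return GLOSSES.get(word.lower(), [])

def is_polysemous(word):
    """A word is polysemous if it has more than one recorded sense."""
    return len(senses(word)) > 1

for number, gloss in enumerate(senses("Bank"), start=1):
    print(f"bank, sense {number}: {gloss}")

print(is_polysemous("bank"))  # True
```

A real lexical database would list more senses for "bank" than these two. The inventory here only covers what the notes show.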
Another way to define meaning: by looking