Contents
Problems with POS Tagging
- Exponentially many combinations: |Tags|^M tag sequences for a sentence of length M
- Tag sequences have different lengths (one per sentence length)
- Tagging is a sentence-level task, but as humans we decompose it into small word-level tasks
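To make the first point concrete, a quick back-of-the-envelope count; the 45-tag Penn Treebank tagset is assumed purely as an illustration:

```python
# Number of candidate tag sequences grows exponentially with sentence length:
# |Tags|^M. Assuming a 45-tag tagset (Penn Treebank size) for illustration.
num_tags = 45
for length in (5, 10, 20):
    print(f"length {length}: {num_tags ** length} sequences")
```

Even for a 10-word sentence this is already over 3 × 10^16 sequences, which is why brute-force enumeration is hopeless and a decomposed model is needed.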
Solution:
- Define a model that decomposes the process into individual word-level steps, but still takes the whole sequence into account when learning and predicting
- This is called sequence labelling, or structured prediction
Probabilistic Model of HMM
- Goal: obtain the best tag sequence t for a sentence w
- The formulation:
  t̂ = argmax_t P(t|w)
- Applying Bayes' rule (the denominator P(w) is constant over t, so it can be dropped):
  t̂ = argmax_t P(w|t)P(t) / P(w) = argmax_t P(w|t)P(t)
- Decomposing the elements:
  - The probability of a word depends only on its tag: P(w|t) = ∏_i P(w_i|t_i)
  - The probability of a tag depends only on the previous tag: P(t) = ∏_i P(t_i|t_{i-1})
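The decomposition above can be sketched as a scoring function over one candidate tag sequence. The probability tables here are illustrative toy values, not trained parameters, and "<s>" is an assumed start-of-sentence marker:

```python
# Score one tag sequence under the HMM decomposition:
# P(t, w) = prod_i P(t_i | t_{i-1}) * P(w_i | t_i)
transition = {  # A: P(tag | previous tag); "<s>" marks the sentence start
    ("<s>", "DT"): 0.6, ("DT", "NN"): 0.7, ("NN", "VB"): 0.3,
}
emission = {  # O: P(word | tag)
    ("DT", "the"): 0.5, ("NN", "dog"): 0.01, ("VB", "barks"): 0.02,
}

def sequence_probability(words, tags):
    prob = 1.0
    prev = "<s>"
    for word, tag in zip(words, tags):
        # One transition and one emission factor per word position
        prob *= transition.get((prev, tag), 0.0) * emission.get((tag, word), 0.0)
        prev = tag
    return prob

print(sequence_probability(["the", "dog", "barks"], ["DT", "NN", "VB"]))
```

The argmax over all |Tags|^M sequences is not computed here; in practice that search is done efficiently with dynamic programming rather than enumeration.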
Two Assumptions of HMM
- Output independence: an observed event (word) depends only on the hidden state (tag):
  P(w_i | t_1, ..., t_i, w_1, ..., w_{i-1}) = P(w_i | t_i)
- Markov assumption: the current state (tag) depends only on the previous state:
  P(t_i | t_1, ..., t_{i-1}) = P(t_i | t_{i-1})
Training HMM
- Parameters are the individual probabilities:
  - Emission probabilities (O): P(w_i | t_i)
  - Transition probabilities (A): P(t_i | t_{i-1})
- Training uses Maximum Likelihood Estimation: done by simply counting tag-word and tag-tag frequencies in a tagged corpus and normalising