PCFG
Need for PCFG
- Example: "Time flies like an arrow"
- The sentence has many parses
- Some parses are more likely than others
- We need a probabilistic method to rank them
Definition
Just like a CFG, a PCFG is a 4-tuple (N, Σ, R, S)
- N: non-terminal symbols
- Σ: terminal symbols (disjoint from N)
- R: rules of the form A → β [p]
  - β ∈ (Σ ∪ N)*
  - p is the probability p(β | A)
- S: start symbol (from N)
Rules with the same left-hand side must have probabilities that sum to 1.
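A minimal sketch of this definition in Python, storing a toy grammar fragment as (A, β, p) triples and checking the constraint above; the non-terminals, rules, and probabilities are made up for illustration.

```python
from collections import defaultdict

# Toy PCFG fragment as (lhs, rhs, probability) triples.
# The rules and probabilities are illustrative, not estimated from data.
rules = [
    ("S",  ("NP", "VP"), 1.0),
    ("NP", ("N",),       0.4),
    ("NP", ("N", "N"),   0.3),
    ("NP", ("Det", "N"), 0.3),
    ("VP", ("V", "NP"),  0.6),
    ("VP", ("V", "PP"),  0.4),
    ("PP", ("P", "NP"),  1.0),
]

# PCFG constraint: for every non-terminal A, the probabilities of all
# rules A -> beta must sum to 1, i.e. the p(beta | A) form a distribution.
totals = defaultdict(float)
for lhs, _rhs, p in rules:
    totals[lhs] += p
for lhs, total in totals.items():
    assert abs(total - 1.0) < 1e-9, f"rules for {lhs} sum to {total}"
```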
Probability of a parse tree
$p(t) = \prod_{i=1}^{n} p(\alpha_i \to \beta_i)$
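A small sketch of this product, under the assumption that a parse tree is encoded as nested tuples such as ("NP", ("N", "time"), ("N", "flies")) with terminal words as plain strings, and that rule_prob maps (lhs, rhs) pairs, including lexical rules like ("N", ("time",)), to their probabilities.

```python
# p(t) = product of the probabilities of all rules used in the tree t.
def tree_probability(tree, rule_prob):
    label, *children = tree
    # Right-hand side of the rule applied at this node.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob[(label, rhs)]
    # Multiply in the probabilities of the rules used in the subtrees.
    for child in children:
        if not isinstance(child, str):
            p *= tree_probability(child, rule_prob)
    return p
```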
Most likely parse tree
$\arg\max_{t \in T(s)} p(t)$
Probability of the sentence
$p(s) = \sum_{i=1}^{n} p(t_i)$
Main tasks for PCFGs
Given a grammar G and a sentence s, let T(s) be the set of all parse trees of s
- Task 1: find the most likely parse tree t
- Task 2: find p(s) as the sum of p(t) over all trees in T(s) (both tasks are sketched below)
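If T(s) is given as an explicit list of candidate trees (an assumption made here to keep the sketch short), both tasks reduce to a max and a sum over the tree probabilities, reusing tree_probability from the sketch above.

```python
# Task 1: the most likely parse tree among the candidates in T(s).
def most_likely_tree(trees, rule_prob):
    return max(trees, key=lambda t: tree_probability(t, rule_prob))

# Task 2: the sentence probability p(s), summing over all trees in T(s).
def sentence_probability(trees, rule_prob):
    return sum(tree_probability(t, rule_prob) for t in trees)
```

In practice T(s) is not enumerated explicitly; the dynamic-programming parsers in the next section compute the max and the sum directly over a chart.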
Probabilistic parsing methods
- Probabilistic Earley algorithm
  - A top-down parser with a dynamic programming table
- Probabilistic CKY algorithm (sketched below)
  - A bottom-up parser with a dynamic programming table
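A minimal sketch of the probabilistic (Viterbi) CKY recursion for a grammar in Chomsky normal form. The function name pcky and the two dictionaries, lexical with (A, word) → p(A → word) and binary with (A, B, C) → p(A → B C), are assumptions for illustration; back-pointers for recovering the best tree are omitted.

```python
def pcky(words, lexical, binary, start="S"):
    n = len(words)
    # table[i][j] holds, for each non-terminal A, the probability of the
    # best parse of words[i:j] rooted in A.
    table = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    # Fill length-1 spans from the lexical rules A -> word.
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w and p > table[i][i + 1].get(A, 0.0):
                table[i][i + 1][A] = p
    # Combine smaller spans bottom-up with the binary rules A -> B C.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    cand = p * table[i][k].get(B, 0.0) * table[k][j].get(C, 0.0)
                    if cand > table[i][j].get(A, 0.0):
                        table[i][j][A] = cand
    # Probability of the most likely parse of the whole sentence.
    return table[0][n].get(start, 0.0)
```

Replacing the max update with a sum over all split points and rules turns this into the inside algorithm, which computes p(s) for Task 2.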
Probabilistic grammars
- Probabilities can be learned from training data (a treebank)
- Possible to do reranking
- Possible to combine with other stages
MLE (maximum likelihood estimation)
$p_{ML}(\alpha \to \beta) = \frac{\text{Count}(\alpha \to \beta)}{\text{Count}(\alpha)}$
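A sketch of this estimator, assuming the treebank has already been flattened into a list of (α, β) rule occurrences read off its trees (that extraction step is not shown).

```python
from collections import Counter

def mle_rule_probs(rule_occurrences):
    """rule_occurrences: list of (lhs, rhs_tuple) pairs read off the treebank trees."""
    occurrences = list(rule_occurrences)
    rule_counts = Counter(occurrences)                   # Count(alpha -> beta)
    lhs_counts = Counter(lhs for lhs, _ in occurrences)  # Count(alpha)
    return {rule: count / lhs_counts[rule[0]] for rule, count in rule_counts.items()}
```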