Columbia University's Coursera course Natural Language Processing, Quiz 2: covers material from weeks 3 and 4

This post works through context-free grammars, probabilistic grammar models, and their application to syntactic parsing, covering how to count a sentence's parse trees, compute probabilities, maximize the probability of left-branching trees, and identify constituents.


Quiz 2: covers material from weeks 3 and 4

Warning: The hard deadline has passed. You can attempt it, but you will not get credit for it. You are welcome to try it as a learning exercise.

Question 1

Say we have a context-free grammar with start symbol S, and the following rules:
  • S → NP VP
  • VP → Vt NP
  • Vt → saw
  • NP → John
  • NP → DT NN
  • DT → the
  • NN → dog
  • NN → cat
  • NN → house
  • NN → mouse
  • NP → NP CC NP
  • CC → and
  • PP → IN NP
  • NP → NP PP
  • IN → with
  • IN → in
How many parse trees does each of the following sentences have under this grammar?
  1. John saw the cat and the dog
  2. John saw the cat and the dog with the mouse
  3. John saw the cat with the dog and the mouse
Write your answer as 3 numbers separated by spaces. For example if you think sentence 1 has 2 parses, sentence 2 has 5 parses, and sentence 3 has 3 parses, you would write 

2 5 3
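
Question 1 is a hand-counting exercise, but the counts can be checked mechanically. Below is a minimal sketch (my own, not part of the quiz) that counts parse trees by dynamic programming over spans; because the grammar includes the ternary rule NP → NP CC NP, the helper count_seq handles right-hand sides of any length rather than assuming Chomsky normal form.

```python
from functools import lru_cache

# Grammar from Question 1 as (lhs, rhs) pairs; terminals are the symbols
# that never appear on a left-hand side.
RULES = [
    ("S",  ("NP", "VP")),
    ("VP", ("Vt", "NP")),
    ("Vt", ("saw",)),
    ("NP", ("John",)),
    ("NP", ("DT", "NN")),
    ("DT", ("the",)),
    ("NN", ("dog",)), ("NN", ("cat",)), ("NN", ("house",)), ("NN", ("mouse",)),
    ("NP", ("NP", "CC", "NP")),
    ("CC", ("and",)),
    ("PP", ("IN", "NP")),
    ("NP", ("NP", "PP")),
    ("IN", ("with",)), ("IN", ("in",)),
]
NONTERMINALS = {lhs for lhs, _ in RULES}

def count_parses(sentence, root="S"):
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def count(sym, i, j):
        # Number of ways `sym` derives words[i:j].
        if sym not in NONTERMINALS:
            return 1 if j - i == 1 and words[i] == sym else 0
        return sum(count_seq(rhs, i, j) for lhs, rhs in RULES if lhs == sym)

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives words[i:j],
        # summing over every split point for the first symbol.
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        return sum(count(rhs[0], i, s) * count_seq(rhs[1:], s, j)
                   for s in range(i + 1, j))

    return count(root, 0, len(words))

for s in ("John saw the cat and the dog",
          "John saw the cat and the dog with the mouse",
          "John saw the cat with the dog and the mouse"):
    print(count_parses(s), s)
```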

Question 2

Say we have a PCFG with start symbol S, and the following rules with associated probabilities:
  • q(S → NP VP) = 1.0
  • q(VP → Vt NP) = 1.0
  • q(Vt → saw) = 1.0
  • q(NP → John) = 0.25
  • q(NP → DT NN) = 0.25
  • q(NP → NP CC NP) = 0.3
  • q(NP → NP PP) = 0.2
  • q(DT → the) = 1.0
  • q(NN → dog) = 0.25
  • q(NN → cat) = 0.25
  • q(NN → house) = 0.25
  • q(NN → mouse) = 0.25
  • q(CC → and) = 1.0
  • q(PP → IN NP) = 1.0
  • q(IN → with) = 0.5
  • q(IN → in) = 0.5

Now assume we have the following sentence:

  • John saw the cat and the dog with the mouse
Which of these statements is true?
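
A direct way to check any claim about this sentence is to compute the two natural quantities involved: the total probability p(x) summed over all parse trees, and the probability of the single most likely tree. Below is a minimal sketch (the span-based dynamic program from Question 1 adapted to probabilities; the function names are my own) that computes both.

```python
from functools import lru_cache

# PCFG from Question 2 as (lhs, rhs, probability) triples.
PRULES = [
    ("S",  ("NP", "VP"), 1.0),
    ("VP", ("Vt", "NP"), 1.0),
    ("Vt", ("saw",), 1.0),
    ("NP", ("John",), 0.25),
    ("NP", ("DT", "NN"), 0.25),
    ("NP", ("NP", "CC", "NP"), 0.3),
    ("NP", ("NP", "PP"), 0.2),
    ("DT", ("the",), 1.0),
    ("NN", ("dog",), 0.25), ("NN", ("cat",), 0.25),
    ("NN", ("house",), 0.25), ("NN", ("mouse",), 0.25),
    ("CC", ("and",), 1.0),
    ("PP", ("IN", "NP"), 1.0),
    ("IN", ("with",), 0.5), ("IN", ("in",), 0.5),
]
NONTERMINALS = {lhs for lhs, _, _ in PRULES}

def sentence_scores(sentence, root="S"):
    """Return (total probability over all trees, probability of the best tree)."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def score(sym, i, j, use_max):
        # Inside score of `sym` over words[i:j]; use_max gives the Viterbi variant.
        if sym not in NONTERMINALS:
            return 1.0 if j - i == 1 and words[i] == sym else 0.0
        combine = max if use_max else sum
        return combine(q * seq(rhs, i, j, use_max)
                       for lhs, rhs, q in PRULES if lhs == sym)

    @lru_cache(maxsize=None)
    def seq(rhs, i, j, use_max):
        # Score of the symbol sequence `rhs` over words[i:j].
        if len(rhs) == 1:
            return score(rhs[0], i, j, use_max)
        combine = max if use_max else sum
        vals = [score(rhs[0], i, s, use_max) * seq(rhs[1:], s, j, use_max)
                for s in range(i + 1, j)]
        return combine(vals) if vals else 0.0

    n = len(words)
    return score(root, 0, n, False), score(root, 0, n, True)

total, best = sentence_scores("John saw the cat and the dog with the mouse")
print(f"p(x) = {total:.8f}, best tree = {best:.8f}")
```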

Question 3

Consider the CKY algorithm for parsing with PCFGs. The usual recursive definition in this algorithm is as follows: 

\[
\pi(i, j, X) = \max_{\substack{s \in \{i \ldots (j-1)\} \\ X \to Y\,Z \,\in\, R}} \Big( q(X \to Y\,Z) \times \pi(i, s, Y) \times \pi(s+1, j, Z) \Big)
\]

Now assume we'd like to modify the CKY parsing algorithm so that it returns the maximum probability for any *left-branching* tree for an input sentence. Here are some example left-branching trees:

[image: example left-branching trees]

It can be seen that in left-branching trees, whenever a rule of the form X → Y Z is seen in the tree, then the non-terminal Z must directly dominate a terminal symbol. 

Which of the following recursive definitions is correct, assuming that our goal is to find the highest probability left-branching tree?
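
One way to see the effect of the constraint: if the right child Z must directly dominate a terminal, then Z always spans exactly one word, so the split point s is forced to j − 1. The sketch below (my own encoding, not one of the quiz's answer options) implements Viterbi CKY under this restriction for a grammar in Chomsky normal form.

```python
def left_branching_viterbi(words, binary_rules, lexical_rules, root="S"):
    """Highest probability of a left-branching parse of `words` (1-indexed spans).

    binary_rules:  dict (Y, Z) -> list of (X, q) for rules X -> Y Z
    lexical_rules: dict word   -> list of (X, q) for rules X -> word
    """
    n = len(words)
    pi = {}

    def get(i, j, X):
        return pi.get((i, j, X), 0.0)

    # Base case: spans of length one, pi(i, i, X) = q(X -> w_i).
    for i, w in enumerate(words, start=1):
        for X, q in lexical_rules.get(w, []):
            pi[(i, i, X)] = max(get(i, i, X), q)

    # Left-branching case: Z dominates a single terminal, so the split point
    # is pinned to j - 1 instead of ranging over i .. j-1.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for (Y, Z), parents in binary_rules.items():
                score = get(i, j - 1, Y) * get(j, j, Z)
                if score > 0.0:
                    for X, q in parents:
                        pi[(i, j, X)] = max(get(i, j, X), q * score)

    return get(1, n, root)

# Tiny usage with a hypothetical two-nonterminal CNF fragment:
lex = {"a": [("A", 1.0)], "b": [("B", 1.0)]}
binary = {("A", "B"): [("S", 0.6)], ("S", "B"): [("S", 0.4)]}
print(left_branching_viterbi(["a", "b", "b"], binary, lex))  # 0.6 * 0.4 = 0.24
```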

Question 4

[image: parse tree for "the man saw the dog with the telescope"]

What are the head words for the following constituents? 

a) The NP "the man" 

b) The PP "with the telescope" 

c) The VP "saw the dog with the telescope" 

d) The S "the man saw the dog with the telescope" 

Write the answer as four words separated by spaces, for example 

the with saw the
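
Head words are computed bottom-up from head rules that pick out one child of each rule as the head child. The sketch below uses a hypothetical head-child table in the spirit of the lectures (the course's actual head-rule table is in the slides); the tree encoding and rule choices are my own.

```python
# Hypothetical head-child table: for each parent label, the label of the child
# that supplies the head word.
HEAD_CHILD = {"S": "VP", "VP": "Vt", "NP": "NN", "PP": "IN"}

def head_word(tree):
    """tree is (label, child, ...); a part of speech has a single string child."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return children[0]                      # part of speech over a word
    target = HEAD_CHILD.get(label)
    for child in children:
        if child[0] == target:
            return head_word(child)             # percolate the head upward
    return head_word(children[-1])              # fallback: rightmost child

print(head_word(("NP", ("DT", "the"), ("NN", "man"))))  # -> man
```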

Question 5

Say we have a PCFG with start symbol S, and rules and probabilities as follows: 

\[ q(S \to a) = 0.3 \]

\[ q(S \to a\ S) = 0.7 \]

For any sentence \( x = x_1 \ldots x_n \), define \( T(x) \) to be the set of parse trees for \( x \) under the above PCFG. For any sentence \( x \), define the probability of the sentence under the PCFG to be

\[ p(x) = \sum_{t \in T(x)} p(t) \]

where \( p(t) \) is the probability of the tree under the PCFG. 

Now assume we'd like to define a bigram language model with the same distribution over sentences as the PCFG. What should be the parameter values for \( q(a \mid *) \), \( q(a \mid a) \), and \( q(\mathrm{STOP} \mid a) \) so that the bigram language model gives the same distribution over sentences as the PCFG? 

(For this question assume that the PCFG does not need to generate STOP symbols: for example the sentence "a a a" in the PCFG translates to the sentence "a a a STOP" in the bigram language model.) 

Write your answer as a sequence of three numbers, for example 

0.1 0.2 0.1
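
One way to set up the calculation (my own working, not the official solution): under the PCFG, the sentence \( a^n \) (n copies of a) has exactly one parse tree, which applies \( S \to a\ S \) exactly \( n-1 \) times and \( S \to a \) once, so

\[ p(a^n) = 0.7^{\,n-1} \times 0.3 . \]

Under the bigram model the same sentence has probability

\[ q(a \mid *) \times q(a \mid a)^{\,n-1} \times q(\mathrm{STOP} \mid a) . \]

Equating the two expressions for every \( n \geq 1 \) pins down all three parameters.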

Question 6

Say we have a PCFG with the following rules and probabilities:

  • q(S → NP VP) = 1.0
  • q(VP → Vt NP) = 0.2
  • q(VP → VP PP) = 0.8
  • q(NP → NNP) = 0.8
  • q(NP → NP PP) = 0.2
  • q(NNP → John) = 0.2
  • q(NNP → Mary) = 0.3
  • q(NNP → Sally) = 0.5
  • q(PP → IN NP) = 1.0
  • q(IN → with) = 1.0
  • q(Vt → saw) = 1.0

Now say we use the CKY algorithm to find the highest probability parse tree under this grammar for the sentence

  • John saw Mary with Sally

We use t_parser to refer to the output of the CKY algorithm on this sentence.

(Note: assume here that we use a variant of the CKY algorithm that can return the highest probability parse under this grammar - don't worry that this grammar is not in Chomsky normal form, assume that we can handle grammars of this form!)

The gold-standard (human-annotated) parse tree for this sentence is

[image: gold-standard parse tree for "John saw Mary with Sally"]

What are the precision and recall of t_parser (give your answers to 3 decimal places)?

Write your answer as a sequence of numbers: for example “0.3 0.8” would mean that your precision is 0.3, your recall is 0.8.

Here each non-terminal in the tree, excluding parts of speech, gives a "constituent" that is used in the definitions of precision and recall. For example, the gold-standard tree shown above has 7 constituents labeled S, NP, VP, NP, NP, PP, NP respectively (we exclude the parts of speech NNP, IN, and Vt).
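
For computing these metrics in general, each constituent can be represented as a labeled span. The sketch below (the (label, start, end) encoding and the toy trees are my own, not this question's) shows the mechanics of labeled precision and recall.

```python
from collections import Counter

def precision_recall(predicted, gold):
    """predicted, gold: lists of (label, start, end) constituents, POS excluded."""
    pred, ref = Counter(predicted), Counter(gold)
    matched = sum((pred & ref).values())        # multiset intersection
    return matched / sum(pred.values()), matched / sum(ref.values())

# Toy four-word example (hypothetical spans, unrelated to this question's trees):
gold = [("S", 1, 4), ("NP", 1, 2), ("VP", 3, 4), ("NP", 4, 4)]
pred = [("S", 1, 4), ("NP", 1, 1), ("VP", 3, 4), ("NP", 4, 4)]
p, r = precision_recall(pred, gold)
print(f"{p:.3f} {r:.3f}")  # 0.750 0.750
```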
