编译原理第二版4.3答案

本文探讨了正则表达式的文法结构,包括提取左公因子和消除左递归的过程,以及这些变换如何影响文法的自顶向下语法分析适用性。通过对多个示例文法的分析,展示了如何进行文法转换,以及转换后的文法是否适用于自顶向下的语法分析。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

4.3 节的练习

4.3.1

下面是一个只包含符号 a 和 b 的正则表达式文法。它使用 + 替代表示并运算的字符 | ,以避免和文法中作为元符号使用的竖线相混淆:

rexpr -> rexpr + rterm | rterm
rterm -> rterm rfactor | rfactor
rfactor -> rfactor * | rprimary
rprimary -> a | b
  1. 对这个文法提取左公因子。
  2. 提取左公因子的变换能使这个文法适用于自顶向下的语法分析技术吗?
  3. 提取左公因子之后,从原文法中消除左递归。
  4. 得到的文法适用于自顶向下的语法分析吗?
解答
  1. 无左公因子

  2. 不适合

  3. 消除左递归

        rexpr -> rterm A
            A -> + rterm A | ε
        rterm -> rfactor B
            B -> rfactor B | ε
      rfactor -> rprimary C
            C -> * C | ε
     rprimary -> a | b
    
  4. 适合?

4.3.2

对下面的文法重复练习 4.3.1

  1. 练习 4.2.1 的文法
  2. 练习 4.2.2-1 的文法
  3. 练习 4.2.2-3 的文法
  4. 练习 4.2.2-5 的文法
  5. 练习 4.2.2-7 的文法
解答
  1. S -> S S + | S S * | a

    1. 提取左公因子

       S -> S S A | a
       A -> + | *
      
    2. 不适合

    3. 消除左递归

       // initial status
       1)S -> S S A | a
       2) A -> + | *
       
       // i = 1
       1) S -> a B
       2) B -> S A B | ε
       3) A -> + | *
       
       // i = 2, j = 1
       1) S -> a B
       2) B -> a B A B | ε
       3) A -> + | *
       
       // i = 3, j = 1 ~ 2
       // nothing changed
      
    4. 适合

  2. S -> 0 S 1 | 0 1

    1. 提取左公因子

       S -> 0 A
       A -> S 1 | 1
      
    2. 不适合,有间接左递归

    3. 消除左递归

       // initial status
       1) S -> 0 A
       2) A -> S 1 | 1
       
       // i = 1
       // nothing changed
       
       // i = 2, j = 1
       1) S -> 0 A
       2) A -> 0 A 1 | 1
      
    4. 合适

  3. S -> S (S) S | ε

    1. 无左公因子

    2. 不合适

    3. 消除左递归

       // initial status
       1) S -> S (S) S | ε
       
       // i = 1
       1) S -> A
       2) A -> (S) S A | ε
       
       // i = 2, j = 1
       // nothing changed
      
    4. 合适

  4. S -> (L) | a 以及 L -> L, S | S

    1. 无左公因子

    2. 不合适

    3. 消除左递归

       // initial status
       1) S -> (L) | a
       2) L -> L, S | S
       
       // i = 1
       // nothing changed
       
       // i = 2, j = 1
       1) S -> (L) | a
       2) L -> (L) A | a A
       3) A -> , S A | ε
       
       // i = 3, j = 1~2
       // nothing changed
      
    4. 合适

4.3.3 !

下面文法的目的是消除 4.3.2 节中讨论的 “悬空-else 二义性”:

stmt -> if expr then stmt
      | matchedStmt
matchedStmt -> if expr then matchedStmt else stmt
             | other

说明这个文法仍然是二义性的。

解答

看一段示范代码,我们通过缩进来表示代码解析的层次结构

if expr 
then 
    if expr 
    then matchedStmt 
    else
        if expr
        then matchedStmt
else stmt

这段代码还可以被解析成

if expr 
then 
    if expr 
    then matchedStmt 
    else
        if expr
        then matchedStmt
        else stmt

所以这仍然是一个二义性的文法。原因在于 matchedStmt -> if expr then matchedStmt else stmt 中的最后一个 stmt,如果包含 else 语句的话,既可以认为是属于这个 stmt 的,也可以认为是属于包含这个 matchedStmt 的语句的。

编译原理龙书答案 完整性高 第二章 2.2 Exercises for Section 2.2 2.2.1 Consider the context-free grammar: S -> S S + | S S * | a Show how the string aa+a* can be generated by this grammar. Construct a parse tree for this string. What language does this grammar generate? Justify your answer. answer S -> S S * -> S S + S * -> a S + S * -> a a + S * -> a a + a * L = {Postfix expression consisting of digits, plus and multiple signs} 2.2.2 What language is generated by the following grammars? In each case justify your answer. S -> 0 S 1 | 0 1 S -> + S S | - S S | a S -> S ( S ) S | ε S -> a S b S | b S a S | ε ⧗ S -> a | S + S | S S | S * | ( S ) answer L = {0n1n | n>=1} L = {Prefix expression consisting of plus and minus signs} L = {Matched brackets of arbitrary arrangement and nesting, includes ε} L = {String has the same amount of a and b, includes ε} ? 2.2.3 Which of the grammars in Exercise 2.2.2 are ambiguous answer No No Yes Yes Yes 2.2.4 Construct unambiguous context-free grammars for each of the following languages. In each case show that your grammar is correct. Arithmetic expressions in postfix notation. Left-associative lists of identifiers separated by commas. Right-associative lists of identifiers separated by commas. Arithmetic expressions of integers and identifiers with the four binary operators +, - , *, /. answer 1. E -> E E op | num 2. list -> list , id | id 3. list -> id , list | id 4. expr -> expr + term | expr - term | term term -> term * factor | term / factor | factor factor -> id | num | (expr) 5. expr -> expr + term | expr - term | term term -> term * unary | term / unary | unary unary -> + factor | - factor factor - > id | num | (expr) 2.2.5 Show that all binary strings generated by the following grammar have values divisible by 3. Hint. Use induction on the number of nodes in a parse tree. num -> 11 | 1001 | num 0 | num num Does the grammar generate all binary strings with values divisible by 3? answer prove any string derived from the grammar can be considered to be a sequence consisting of 11, 1001 and 0, and not prefixed with 0. the sum of this string is: sum = Σn (21 + 20) * 2 n + Σm (23 + 20) * 2m = Σn 3 * 2 n + Σm 9 * 2m It is obviously can divisible by 3. No. Consider string "10101", it is divisible by 3, but cannot derived from the grammar. Question: any general prove? 2.2.6 Construct a context-free grammar for roman numerals. Note: we just consider a subset of roman numerals which is less than 4k. answer wikipedia: Roman_numerals via wikipedia, we can categorize the single noman numerals into 4 groups: I, II, III | I V | V, V I, V II, V III | I X then get the production: digit -> smallDigit | I V | V smallDigit | I X smallDigit -> I | II | III | ε and we can find a simple way to map roman to arabic numerals. For example: XII => X, II => 10 + 2 => 12 CXCIX => C, XC, IX => 100 + 90 + 9 => 199 MDCCCLXXX => M, DCCC, LXXX => 1000 + 800 + 80 => 1880 via the upper two rules, we can derive the production: romanNum -> thousand hundred ten digit thousand -> M | MM | MMM | ε hundred -> smallHundred | C D | D smallHundred | C M smallHundred -> C | CC | CCC | ε ten -> smallTen | X L | L smallTen | X C smallTen -> X | XX | XXX | ε digit -> smallDigit | I V | V smallDigit | I X smallDigit -> I | II | III | ε 2.3 Exercises for Section 2.3 2.3.1 Construct a syntax-directed translation scheme that trans­ lates arithmetic expressions from infix notation into prefix notation in which an operator appears before its operands; e.g. , -xy is the prefix notation for x - y . Give annotated parse trees for the inputs 9-5+2 and 9-5*2.。 answer productions: expr -> expr + term | expr - term | term term -> term * factor | term / factor | factor factor -> digit | (expr) translation schemes: expr -> {print("+")} expr + term | {print("-")} expr - term | term term -> {print("*")} term * factor | {print("/")} term / factor | factor factor -> digit {print(digit)} | (expr) 2.3.2 Construct a syntax-directed translation scheme that trans­ lates arithmetic expressions from postfix notation into infix notation. Give annotated parse trees for the inputs 95-2* and 952*-. answer productions: expr -> expr expr + | expr expr - | expr expr * | expr expr / | digit translation schemes: expr -> expr {print("+")} expr + | expr {print("-")} expr - | {print("(")} expr {print(")*(")} expr {print(")")} * | {print("(")} expr {print(")/(")} expr {print(")")} / | digit {print(digit)} Another reference answer E -> {print("(")} E {print(op)} E {print(")"}} op | digit {print(digit)} 2.3.3 Construct a syntax-directed translation scheme that trans­ lates integers into roman numerals answer assistant function: repeat(sign, times) // repeat(&#39;a&#39;,2) = &#39;aa&#39; translation schemes: num -> thousand hundred ten digit { num.roman = thousand.roman || hundred.roman || ten.roman || digit.roman; print(num.roman)} thousand -> low {thousand.roman = repeat(&#39;M&#39;, low.v)} hundred -> low {hundred.roman = repeat(&#39;C&#39;, low.v)} | 4 {hundred.roman = &#39;CD&#39;} | high {hundred.roman = &#39;D&#39; || repeat(&#39;X&#39;, high.v - 5)} | 9 {hundred.roman = &#39;CM&#39;} ten -> low {ten.roman = repeat(&#39;X&#39;, low.v)} | 4 {ten.roman = &#39;XL&#39;} | high {ten.roman = &#39;L&#39; || repeat(&#39;X&#39;, high.v - 5)} | 9 {ten.roman = &#39;XC&#39;} digit -> low {digit.roman = repeat(&#39;I&#39;, low.v)} | 4 {digit.roman = &#39;IV&#39;} | high {digit.roman = &#39;V&#39; || repeat(&#39;I&#39;, high.v - 5)} | 9 {digit.roman = &#39;IX&#39;} low -> 0 {low.v = 0} | 1 {low.v = 1} | 2 {low.v = 2} | 3 {low.v = 3} high -> 5 {high.v = 5} | 6 {high.v = 6} | 7 {high.v = 7} | 8 {high.v = 8} 2.3.4 Construct a syntax-directed translation scheme that trans­ lates roman numerals into integers. answer productions: romanNum -> thousand hundred ten digit thousand -> M | MM | MMM | ε hundred -> smallHundred | C D | D smallHundred | C M smallHundred -> C | CC | CCC | ε ten -> smallTen | X L | L smallTen | X C smallTen -> X | XX | XXX | ε digit -> smallDigit | I V | V smallDigit | I X smallDigit -> I | II | III | ε translation schemes: romanNum -> thousand hundred ten digit {romanNum.v = thousand.v || hundred.v || ten.v || digit.v; print(romanNun.v)} thousand -> M {thousand.v = 1} | MM {thousand.v = 2} | MMM {thousand.v = 3} | ε {thousand.v = 0} hundred -> smallHundred {hundred.v = smallHundred.v} | C D {hundred.v = smallHundred.v} | D smallHundred {hundred.v = 5 + smallHundred.v} | C M {hundred.v = 9} smallHundred -> C {smallHundred.v = 1} | CC {smallHundred.v = 2} | CCC {smallHundred.v = 3} | ε {hundred.v = 0} ten -> smallTen {ten.v = smallTen.v} | X L {ten.v = 4} | L smallTen {ten.v = 5 + smallTen.v} | X C {ten.v = 9} smallTen -> X {smallTen.v = 1} | XX {smallTen.v = 2} | XXX {smallTen.v = 3} | ε {smallTen.v = 0} digit -> smallDigit {digit.v = smallDigit.v} | I V {digit.v = 4} | V smallDigit {digit.v = 5 + smallDigit.v} | I X {digit.v = 9} smallDigit -> I {smallDigit.v = 1} | II {smallDigit.v = 2} | III {smallDigit.v = 3} | ε {smallDigit.v = 0} 2.3.5 Construct a syntax-directed translation scheme that trans­ lates postfix arithmetic expressions into equivalent prefix arithmetic expressions. answer production: expr -> expr expr op | digit translation scheme: expr -> {print(op)} expr expr op | digit {print(digit)} Exercises for Section 2.4 2.4.1 Construct recursive-descent parsers, starting with the follow­ ing grammars: S -> + S S | - S S | a S -> S ( S ) S | ε S -> 0 S 1 | 0 1 Answer 1) S -> + S S | - S S | a void S(){ switch(lookahead){ case "+": match("+"); S(); S(); break; case "-": match("-"); S(); S(); break; case "a": match("a"); break; default: throw new SyntaxException(); } } void match(Terminal t){ if(lookahead = t){ lookahead = nextTerminal(); }else{ throw new SyntaxException() } } 2) S -> S ( S ) S | ε void S(){ if(lookahead == "("){ S(); match("("); S(); match(")"); S(); } } 3) S -> 0 S 1 | 0 1 void S(){ switch(lookahead){ case "0": match("0"); S(); match("1"); break; case "1": // match(epsilon); break; default: throw new SyntaxException(); } } Exercises for Section 2.6 2.6.1 Extend the lexical analyzer in Section 2.6.5 to remove com­ ments, defined as follows: A comment begins with // and includes all characters until the end of that line. A comment begins with /* and includes all characters through the next occurrence of the character sequence */. 2.6.2 Extend the lexical analyzer in Section 2.6.5 to recognize the relational operators <, =, >. 2.6.3 Extend the lexical analyzer in Section 2.6.5 to recognize float­ ing point numbers such as 2., 3.14, and . 5. Answer Source code: commit 8dd1a9a Code snippet(src/lexer/Lexer.java): public Token scan() throws IOException, SyntaxException{ for(;;peek = (char)stream.read()){ if(peek == &#39; &#39; || peek == &#39;\t&#39;){ continue; }else if(peek == &#39;\n&#39;){ line = line + 1; }else{ break; } } // handle comment if(peek == &#39;/&#39;){ peek = (char) stream.read(); if(peek == &#39;/&#39;){ // single line comment for(;;peek = (char)stream.read()){ if(peek == &#39;\n&#39;){ break; } } }else if(peek == &#39;*&#39;){ // block comment char prevPeek = &#39; &#39;; for(;;prevPeek = peek, peek = (char)stream.read()){ if(prevPeek == &#39;*&#39; && peek == &#39;/&#39;){ break; } } }else{ throw new SyntaxException(); } } // handle relation sign if("".indexOf(peek) > -1){ StringBuffer b = new StringBuffer(); b.append(peek); peek = (char)stream.read(); if(peek == &#39;=&#39;){ b.append(peek); } return new Rel(b.toString()); } // handle number, no type sensitive if(Character.isDigit(peek) || peek == &#39;.&#39;){ Boolean isDotExist = false; StringBuffer b = new StringBuffer(); do{ if(peek == &#39;.&#39;){ isDotExist = true; } b.append(peek); peek = (char)stream.read(); }while(isDotExist == true ? Character.isDigit(peek) : Character.isDigit(peek) || peek == &#39;.&#39;); return new Num(new Float(b.toString())); } // handle word if(Character.isLetter(peek)){ StringBuffer b = new StringBuffer(); do{ b.append(peek); peek = (char)stream.read(); }while(Character.isLetterOrDigit(peek)); String s = b.toString(); Word w = words.get(s); if(w == null){ w = new Word(Tag.ID, s); words.put(s, w); } return w; } Token t = new Token(peek); peek = &#39; &#39;; return t; } Exercises for Section 2.8 2.8.1 For-statements in C and Java have the form: for ( exprl ; expr2 ; expr3 ) stmt The first expression is executed before the loop; it is typically used for initializ­ ing the loop index. The second expression is a test made before each iteration of the loop; the loop is exited if the expression becomes O. The loop itself can be thought of as the statement {stmt expr3 ; }. The third expression is executed at the end of each iteration; it is typically used to increment the loop index. The meaning of the for-statement is similar to expr1 ; while ( expr2 ) {stmt expr3 ; } Define a class For for for-statements, similar to class If in Fig. 2.43. Answer class For extends Stmt{ Expr E1; Expr E2; Expr E3; Stmt S; public For(Expr expr1, Expr expr2, Expr expr3, Stmt stmt){ E1 = expr1; E2 = expr2; E3 = expr3; S = stmt; } public void gen(){ E1.gen(); Label start = new Lable(); Lalel end = new Lable(); emit("ifFalse " + E2.rvalue().toString() + " goto " + end); S.gen(); E3.gen(); emit("goto " + start); emit(end + ":") } } 2.8.2 The programming language C does not have a boolean type. Show how a C compiler might translate an if-statement into three-address code. Answer Replace emit("isFalse " + E.rvalue().toString() + " goto " + after); with emit("ifNotEqual " + E.rvalue().toString() + " 0 goto " + after); or emit("isNotEqualZero " + E.rvalue().toString() + " goto " + after);
编译原理》课后习题答案第一章 第 1 章引论 第 1 题 解释下列术语: (1)编译程序 (2)源程序 (3)目标程序 (4)编译程序的前端 (5)后端 (6)遍 答案: (1) 编译程序:如果源语言为高级语言,目标语言为某台计算机上的汇编语言或机器语 言,则此翻译程序称为编译程序。 (2) 源程序:源语言编写的程序称为源程序。 (3) 目标程序:目标语言书写的程序称为目标程序。 (4) 编译程序的前端:它由这样一些阶段组成:这些阶段的工作主要依赖于源语言而与 目标机无关。通常前端包括词法分析、语法分析、语义分析中间代码生成这些阶 段,某些优化工作也可在前端做,也包括与前端每个阶段相关的出错处理工作符 号表管理等工作。 (5) 后端:指那些依赖于目标机而一般不依赖源语言,只与中间代码有关的那些阶段, 即目标代码生成,以及相关出错处理符号表操作。 (6) 遍:是对源程序或其等价的中间语言程序从头到尾扫视并完成规定任务的过程。 第 2 题 一个典型的编译程序通常由哪些部分组成?各部分的主要功能是什么?并画出编译程 序的总体结构图。 答案一个典型的编译程序通常包含 8 个组成部分,它们是词法分析程序、语法分析程序、语 义分析程序、中间代码生成程序、中间代码优化程序、目标代码生成程序、表格管理程序 错误处理程序。其各部分的主要功能简述如下。 词法分析程序:输人源程序,拼单词、检查单词分析单词,输出单词的机内表达形式。 语法分析程序:检查源程序中存在的形式语法错误,输出错误处理信息。 语义分析程序:进行语义检查分析语义信息,并把分析的结果保存到各类语义信息表 中。 中间代码生成程序:按照语义规则,将语法分析程序分析出的语法单位转换成一定形式 的中间语言代码,如三元式或四元式。 中间代码优化程序:为了产生高质量的目标代码,对中间代码进行等价变换处理。 盛威网(www.snwei.com)专业的计算机学习网站1 《编译原理》课后习题答案第一章 目标代码生成程序:将优化后的中间代码程序转换成目标代码程序。 表格管理程序:负责建立、填写查找等一系列表格工作。表格的作用是记录源程序的 各类信息编译各阶段的进展情况,编译的每个阶段所需信息多数都从表格中读取,产生的 中间结果都记录在相应的表格中。可以说整个编译过程就是造表、查表的工作过程。需要指 出的是,这里的“表格管理程序”并不意味着它就是一个独立的表格管理模块,而是指编译 程序具有的表格管理功能。 错误处理程序:处理校正源程序中存在的词法、语法语义错误。当编译程序发现源 程序中的错误时,错误处理程序负责报告出错的位置错误性质等信息,同时对发现的错误 进行适当的校正(修复),目的是使编译程序能够继续向下进行分析处理。 注意:如果问编译程序有哪些主要构成成分,只要回答六部分就可以。如果搞不清楚, 就回答八部分。 第 3 题 何谓翻译程序、编译程序解释程序?它们三者之间有何种关系? 答案: 翻译程序是指将用某种语言编写的程序转换成另一种语言形式的程序的程序,如编译程 序汇编程序等。 编译程序是把用高级语言编写的源程序转换(加工)成与之等价的另一种用低级语言编 写的目标程序的翻译程序。 解释程序是解释、执行高级语言源程序的程序。解释方式一般分为两种:一种方式是, 源程序功能的实现完全由解释程序承担完成,即每读出源程序的一条语句的第一个单词, 则依据这个单词把控制转移到实现这条语句功能的程序部分,该部分负责完成这条语句的功 能的实现,完成后返回到解释程序的总控部分再读人下一条语句继续进行解释、执行,如此 反复;另一种方式是,一边翻译一边执行,即每读出源程序的一条语句,解释程序就将其翻 译成一段机器指令并执行之,然后再读人下一条语句继续进行解释、执行,如此反复。无论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值