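I have been reflecting on my education lately, so I dug out my old textbooks and reread them carefully, and found many things I had never really understood.
I have just finished the chapters on lexical analysis and syntax analysis. The more I read, the simpler it looks, and I can't see why I used to find it so hard. In short, I lacked practice back then.
Below is an example I ran into while working with lex and yacc.
Chapter 1 of lex & yacc (2nd edition) contains the following sample source.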
ch1-05.l
%{
/*
* We now build a lexical analyzer to be used by a higher-level parser.
*/
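#include <stdio.h> /* printf */
#include "ch1-05y.h" /* token codes from the parser */
#define LOOKUP 0 /* default - not a defined word type. */
int state;
/* declared up front because the rules below call them before they are defined */
int add_word(int type, char *word);
int lookup_word(char *word);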
%}
%%
\n { state = LOOKUP; }
\.\n { state = LOOKUP;
       return 0; /* end of sentence */
     }
^verb { state = VERB; }
^adj { state = ADJECTIVE; }
^adv { state = ADVERB; }
^noun { state = NOUN; }
^prep { state = PREPOSITION; }
^pron { state = PRONOUN; }
^conj { state = CONJUNCTION; }
[a-zA-Z]+ {
    if(state != LOOKUP) {
        add_word(state, yytext);
    } else {
        switch(lookup_word(yytext)) {
        case VERB:
            return(VERB);
        case ADJECTIVE:
            return(ADJECTIVE);
        case ADVERB:
            return(ADVERB);
        case NOUN:
            return(NOUN);
        case PREPOSITION:
            return(PREPOSITION);
        case PRONOUN:
            return(PRONOUN);
        case CONJUNCTION:
            return(CONJUNCTION);
        default:
            printf("%s: don't recognize\n", yytext);
            /* don't return, just ignore it */
        }
    }
}
. ;
%%
#include <stdlib.h> /* malloc */
#include <string.h> /* strlen, strcpy, strcmp */
/* define a linked list of words and types */
struct word {
    char *word_name;
    int word_type;
    struct word *next;
};
struct word *word_list; /* first element in word list */
int
add_word(int type, char *word)
{
    struct word *wp;
    if(lookup_word(word) != LOOKUP) {
        printf("!!! warning: word %s already defined \n", word);
        return 0;
    }
    /* word not there, allocate a new entry and link it on the list */
    wp = (struct word *) malloc(sizeof(struct word));
    wp->next = word_list;
    /* have to copy the word itself as well */
    wp->word_name = (char *) malloc(strlen(word)+1);
    strcpy(wp->word_name, word);
    wp->word_type = type;
    word_list = wp;
    return 1; /* it worked */
}
int
lookup_word(char *word)
{
    struct word *wp = word_list;
    /* search down the list looking for the word */
    for(; wp; wp = wp->next) {
        if(strcmp(wp->word_name, word) == 0)
            return wp->word_type;
    }
    return LOOKUP; /* not found */
}
ch1-05.y
%{
/*
* A lexer for the basic grammar to use for recognizing english sentences.
*/
#include <stdio.h>
%}
%token NOUN PRONOUN VERB ADVERB ADJECTIVE PREPOSITION CONJUNCTION
%%
sentence: subject VERB object { printf("Sentence is valid.\n"); }
    ;
subject: NOUN
    | PRONOUN
    ;
object: NOUN
    ;
%%
extern FILE *yyin;
main()
{
    while(!feof(yyin)) {
        yyparse();
    }
}
yyerror(s)
char *s;
{
    fprintf(stderr, "%s\n", s);
}
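ch1-05.y is the yacc program. The yyparse() routine starts the syntax analysis. As compiler theory teaches, yyparse calls yylex, the scanner generated by lex, each time it needs a token.
When the input matches a lexical rule, yylex runs the corresponding action, and an action that executes return hands a token back to yyparse.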
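The calling relationship described above can be sketched in plain C, without flex or bison. This is a hand-written stand-in, not the generated code: fake_yylex, fake_yyparse, set_input and lexer_calls are hypothetical names, the token codes are made up, and the "parser" only checks the single rule sentence: subject VERB object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token codes; the real ones come from the bison-generated header. */
enum { END = 0, NOUN = 258, PRONOUN, VERB };

static const int *stream;   /* canned tokens standing in for scanner input */
static int lex_calls;       /* how many times the parser asked for a token */

/* Point the stand-in lexer at a token array terminated by END. */
void set_input(const int *tokens)
{
    assert(tokens != NULL);
    stream = tokens;
    lex_calls = 0;
}

/* Stand-in for yylex(): one token per call, END (0) once input is exhausted. */
int fake_yylex(void)
{
    lex_calls++;
    return *stream == END ? END : *stream++;
}

/* Stand-in for yyparse(): pulls tokens on demand; 0 = accepted, 1 = syntax error. */
int fake_yyparse(void)
{
    int t = fake_yylex();
    if (t != NOUN && t != PRONOUN) return 1;   /* subject: NOUN | PRONOUN */
    if (fake_yylex() != VERB) return 1;        /* VERB */
    if (fake_yylex() != NOUN) return 1;        /* object: NOUN */
    if (fake_yylex() != END) return 1;         /* 0 marks end of sentence */
    printf("Sentence is valid.\n");
    return 0;
}

/* Number of tokens requested since the last set_input(). */
int lexer_calls(void) { return lex_calls; }
```

The point of the sketch is that the parser is the one driving: the lexer never runs on its own, it is called once per token, and a return value of 0 is what tells the parser the input has ended.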
Compile and run (with flex and bison, under the cygwin emulation environment):
flex ch1-05.l
mv lex.yy.c ch1-05l.c
bison -d -o ch1-05y.c ch1-05.y
gcc -g -DYYDEBUG -c -o ch1-05l.o ch1-05l.c
gcc -g -DYYDEBUG -c -o ch1-05y.o ch1-05y.c
gcc -g -o ch1-05.pgm ch1-05l.o ch1-05y.o -lfl
Here -lfl links against the flex library.
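(1) First run of ./ch1-05.pgm: it reports a segmentation fault. Debugging with gdb shows the crash happens at while(!feof(yyin)). After reading the generated ch1-05l.c and ch1-05y.c, plus some guesswork,
my conclusion is that yyin has no value yet at that point: the flex scanner only defaults it to stdin on the first call to yylex, which happens inside yyparse.
So I changed main in ch1-05y.c to: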
main()
{
    do {
        yyparse();
    } while(!feof(yyin));
}
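The crash in (1) and the reason the do/while version avoids it can be mimicked in plain C without flex. This is only a sketch under assumptions: the variable in stands in for yyin, and scanner_init stands in for the setup a flex scanner performs on its first yylex call, where an unset yyin defaults to stdin. The names scanner_init, at_eof and stream_is_set are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

static FILE *in;   /* plays the role of yyin: NULL until the scanner first runs */

/* Stand-in for the scanner's first-call setup: default the stream if unset. */
void scanner_init(FILE *fallback)
{
    assert(fallback != NULL);
    if (in == NULL)
        in = fallback;
}

/* End-of-input check that never hands a NULL stream to feof(). */
int at_eof(void)
{
    return in != NULL && feof(in);
}

/* Reports whether the stream has been assigned yet. */
int stream_is_set(void) { return in != NULL; }
```

Before scanner_init runs, the stream is NULL, which is the state the original while(!feof(yyin)) loop found it in. Another way to avoid the crash in the real program would be to assign yyin = stdin; at the top of main before the loop.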
(2) Second run of ./ch1-05.pgm, with the following input:
verb is are am
noun i he pig
he is student
It reports "Sentence is valid."
Then enter again:
he is student
It reports:
syntax error
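Why?
I thought about it, then went back over how LALR parsing works and how ch1-05.l processes the input.
After yyparse is called, it calls yylex many times, and by this point it has already reduced
he is student to sentence. But the scanner only returns 0 (end of input) on a period followed by a newline, so without the period that single yyparse call has not finished. When the second he is student is entered,
the stack holds sentence and the remaining input is he is student. The parser can neither shift nor reduce, so it reports a syntax error.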
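The stack argument above can be traced with a toy shift-reduce loop in plain C. This is a simplified sketch, not the LALR tables bison generates: the symbol codes and the function first_error are hypothetical, and the grammar is hard-coded as a handful of stack checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol codes, not the values bison generates. */
enum { END = 0, NOUN, PRONOUN, VERB, SUBJECT, SENTENCE };

/*
 * Feed tokens (terminated by END) to a toy shift-reduce parser.
 * Returns -1 if the input is accepted, otherwise the index of the
 * token for which the parser has no action (the syntax error).
 */
int first_error(const int *tokens)
{
    int stack[4];
    int top = 0;          /* number of symbols currently on the stack */
    size_t i;

    assert(tokens != NULL);
    for (i = 0; ; i++) {
        int t = tokens[i];
        if (top == 0 && (t == NOUN || t == PRONOUN)) {
            stack[top++] = SUBJECT;      /* shift, then reduce to subject */
        } else if (top == 1 && stack[0] == SUBJECT && t == VERB) {
            stack[top++] = VERB;         /* shift VERB */
        } else if (top == 2 && t == NOUN) {
            top = 0;
            stack[top++] = SENTENCE;     /* subject VERB object -> sentence */
        } else if (top == 1 && stack[0] == SENTENCE && t == END) {
            return -1;                   /* accept: sentence followed by end marker */
        } else {
            return (int)i;               /* nothing to do with this token */
        }
    }
}
```

With the tokens for two sentences and no end marker between them (NOUN VERB NOUN NOUN VERB NOUN), the function returns 3: the stack already holds sentence when the fourth token arrives, and only the end marker is acceptable there.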
(3) Third run of ./ch1-05.pgm, with the following input:
verb is are am
noun i he pig
he is student
It reports "Sentence is valid."
Then enter:
.    (after this line is entered, yylex returns 0 and the current yyparse call ends)
he is student.
It reports:
Sentence is valid.
That is a basic example of lexical and syntax analysis. I hope it gives you some ideas.