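Title: Lexical Analyzer with Compilation Preprocessing
Purpose: To understand the process of lexical analysis by implementing a lexical analysis program.
Experiment content: Use Java to implement a lexical analyzer, with compilation preprocessing, that recognizes a subset of a programming language.
Experiment requirements: Read a source program in this language as a character stream, then convert and analyze it.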
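Design of the lexical analyzer and preprocessor: To keep the compiler practical, the source program may be written in free format: one line may contain several statements, and one statement may span several lines. Only the first 20 characters of an identifier are significant; integers are stored in 2 bytes and long integers in 4 bytes. The main tasks of such a lexical analyzer are listed below (time was short, so adjust the parsing statements in the program yourself as needed):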
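(1) Read characters from the source program file.
(2) Count lines and columns so that erroneous tokens can be located.
(3) Remove whitespace characters, including carriage returns, tabs and spaces.
(4) Assemble characters into tokens and represent each one as an (internal code, attribute) pair.
(5) Fill in the identifier table where needed, for use by later phases.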
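Tasks (2) to (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the appendix implementation: the class name `TokenSketch`, the `Token` type and the numeric codes `IDENT` and `NUMBER` are hypothetical, and the attribute is simply the token text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: tokens as (code, attribute) pairs with line/column tracking.
class TokenSketch {
    static final int IDENT = 10, NUMBER = 11;  // illustrative codes only

    static final class Token {
        final int code;       // internal code
        final String attr;    // attribute: here simply the token text
        final int line, col;  // position of the first character, both 1-based
        Token(int code, String attr, int line, int col) {
            this.code = code; this.attr = attr; this.line = line; this.col = col;
        }
        @Override public String toString() {
            return "(" + code + ", " + attr + ") at " + line + ":" + col;
        }
    }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    static List<Token> scan(String src) {
        List<Token> out = new ArrayList<>();
        int line = 1, col = 1, i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\n') {                       // new line: reset the column
                line++; col = 1; i++;
            } else if (c == ' ' || c == '\t' || c == '\r') {
                col++; i++;                        // whitespace is dropped
            } else if (isLetter(c) || isDigit(c)) {
                int start = i;
                boolean ident = isLetter(c);
                // identifiers take letters and digits, numbers take digits only
                while (i < src.length()
                        && (isDigit(src.charAt(i)) || (ident && isLetter(src.charAt(i))))) {
                    i++;
                }
                out.add(new Token(ident ? IDENT : NUMBER, src.substring(start, i), line, col));
                col += i - start;
            } else {
                // any other character becomes a one-character token coded by its char value
                out.add(new Token(c, String.valueOf(c), line, col));
                col++; i++;
            }
        }
        return out;
    }
}
```

Calling `TokenSketch.scan("ab 12\n x;")` yields four tokens; the third one, `x`, is reported at line 2, column 2, which is the information task (2) needs for error messages.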
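The compiler is implemented as a single pass: the source program is scanned only once, from left to right, and lexical analysis serves as a subroutine of syntax analysis. The internal-code stream of the whole source program is therefore obtained by calling the lexical analysis subroutine repeatedly, one token per call. Scanner flowchart
Flowchart: to be uploaded
Experiment results: to be uploaded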
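Summary: The program performs lexical analysis on the special symbols '\n' ',' ' ' '\t' '{' '}' '(' ')' ';' '=' '+' '-' '*' '/',
the keywords int real for while do begin end if then AND OR NOT repeat until read write return true false boolean program const to,
and letters and digits. It also preprocesses the source code by removing comments. The program ran successfully.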
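Shortcomings: The program was written in haste and has some weaknesses. The record of the lexical analysis cannot be saved. The preprocessing step handles only "//" comments and not "/* */" comments. When the output is produced, a space is printed before each ';' in the program.
Through this exercise I learned how to write a Java application and gained an understanding of the lexical analysis process.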
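One possible way to close the "/* */" gap noted above is sketched here. It is an assumption about how the preprocessing could be extended, not part of the submitted program: the class name `CommentStripper` is hypothetical, and the sketch deliberately ignores string literals, so a "//" inside quotes would also be treated as a comment.

```java
// Sketch: remove "//" and "/* ... */" comments while keeping every line break,
// so that line numbers in later error messages stay correct.
class CommentStripper {
    static String strip(String src) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == '/' && i + 1 < n && src.charAt(i + 1) == '/') {
                // line comment: skip up to, but not including, the line break
                while (i < n && src.charAt(i) != '\n') i++;
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '*') {
                i += 2;
                // block comment: skip to the closing "*/", copying only line breaks
                while (i < n && !(src.charAt(i) == '*' && i + 1 < n && src.charAt(i + 1) == '/')) {
                    if (src.charAt(i) == '\n') out.append('\n');
                    i++;
                }
                i = Math.min(i + 2, n);  // step over "*/", or stop at the end if unterminated
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

For example, `CommentStripper.strip("a = 1; // x\nb /* y\nz */ = 2;")` returns `"a = 1; \nb \n = 2;"`: both comments are gone and the two line breaks survive.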
Appendix code:
import java.awt.*;
import java.awt.event.*;
import java.io.*;
public class Compiler extends Frame implements ActionListener{
int row = 1;
int line = 1;
MenuBar mb = new MenuBar();
Menu fileMenu = new Menu("File");
Menu actionMenu = new Menu("Project");
MenuItem closeWindow = new MenuItem("Exit");
MenuItem openFile = new MenuItem("Open file");
MenuItem lexical_check = new MenuItem("Check lex");
int begin = 0;
int end = 0;
TextArea text = new TextArea(25,60);
TextArea error_text = new TextArea(10,60);
FileDialog file_dialog_load = new FileDialog(this, "Open file...", FileDialog.LOAD);
Compiler(){
this.setLayout(new FlowLayout());
this.setMenuBar(mb);
mb.add(fileMenu);
mb.add(actionMenu);
fileMenu.add(openFile);
fileMenu.add(closeWindow);
actionMenu.add(lexical_check);
this.add(text);
this.add(error_text);
error_text.setText("Lexical Information: \n");
closeWindow.addActionListener(this);
openFile.addActionListener(this);
lexical_check.addActionListener(this);
this.setSize(500, 600);
this.addWindowListener(new WindowAdapter(){
public void windowClosing(WindowEvent e){
System.exit(0);
}
});
this.setVisible(true);
}
public static void main(String[] args) {
// The constructor builds and shows the window
new Compiler();
}
public void actionPerformed(ActionEvent e){
if(e.getSource() == closeWindow){
System.exit(0);
}else if(e.getSource() == openFile){
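file_dialog_load.setVisible(true);
// getFile() returns null when the dialog is cancelled
if(file_dialog_load.getFile() == null) return;
File myfile = new File(file_dialog_load.getDirectory(), file_dialog_load.getFile());
// try-with-resources closes the reader even if reading fails
try(BufferedReader bufReader = new BufferedReader(new FileReader(myfile))){
StringBuilder content = new StringBuilder();
String str;
while((str = bufReader.readLine()) != null){
content.append(str).append('\n');
}
// Update the text area once, after the whole file has been read
text.setText(content.toString());
}catch(IOException ie){
System.out.println("IOException occurs...");
}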
}else if(e.getSource() == lexical_check){
error_text.setText("");
row = 0;
line = 1;
checkLexical();
}
}
public void checkLexical(){
    String content = text.getText();
    if(content.equals("")){
        error_text.append("Text is empty! You haven't input any code!\n");
        return;
    }
    // A trailing '\n' acts as a sentinel: every token and every forward scan
    // below stops at a line break at the latest, so charAt(i) stays in range.
    content += "\n";
    int N = content.length();
    int state = 0;
    int lineStart = 0;  // index of the first character of the current line
    int echoed = -1;    // last index already copied into "stripped"
    StringBuilder stripped = new StringBuilder();  // source with "//" comments removed
    for(int i = 0; i < N; i++){
        char c = content.charAt(i);
        row = i - lineStart + 1;  // "row" holds the column within the current line
        if(i > echoed){
            // copy each character once, including those consumed by inner scans
            stripped.append(content, echoed + 1, i + 1);
            echoed = i;
        }
        switch(state){
        case 0: // start state
            if(c == ',' || c == ' ' || c == '\t' || c == '\r' || c == '{' || c == '}' || c == '(' || c == ')' || c == ';' || c == '[' || c == ']'){
                // separators produce no output
            }
            else if(c == '\n'){ // line break (code 9)
                line++;
                lineStart = i + 1;
                error_text.append("info - :9 \n");
            }
            else if(c == '+') state = 1;
            else if(c == '-') state = 2;
            else if(c == '*') state = 3;
            else if(c == '/') state = 4;
            else if(c == '!') state = 5;
            else if(c == '>') state = 6;
            else if(c == '<') state = 7;
            else if(c == '=') state = 8;
            else if(isLetter(c)){
                begin = i;
                state = 10;
            }
            else if(isDigit(c)){
                begin = i;
                state = 11;
            }
            else if(c == '#') state = 12;
            else if(c == '&') state = 14;
            else if(c == '|') state = 15;
            else if(c == '"') state = 16;
            else error_text.append("line: " + line + " row: " + row + " error: '" + c + "' Undefined character! \n");
            break;
        case 1: // after '+'
            if(c == '+')
                error_text.append("info + :1 '++'\n");
            else if(c == '=')
                error_text.append("info + :1 '+='\n");
            else{
                error_text.append("info + :1 '+'\n");
                i--; // re-scan the current character in state 0
            }
            state = 0;
            break;
        case 2: // after '-'
            if(c == '-')
                error_text.append("info - :2 '--'\n");
            else if(c == '=')
                error_text.append("info - :2 '-='\n");
            else{
                error_text.append("info - :2 '-'\n");
                i--;
            }
            state = 0;
            break;
        case 3: // after '*'
            if(c == '=')
                error_text.append("info * :3 '*='\n");
            else{
                error_text.append("info * :3 '*'\n");
                i--;
            }
            state = 0;
            break;
        case 4: // after '/'
            if(c == '/'){
                // Line comment: drop the two slashes already copied, then skip
                // to the line break. "/* */" comments are not handled.
                stripped.setLength(stripped.length() - 2);
                while(content.charAt(i) != '\n') i++;
                echoed = i - 1; // the comment text is left out of the output
                i--;            // let state 0 count the line break
                error_text.append("info / :4 // \n");
            }else if(c == '='){
                error_text.append("info / :4 /= \n");
            }else{
                error_text.append("info / :4 / \n");
                i--;
            }
            state = 0;
            break;
        case 5: // after '!'
            if(c == '=')
                error_text.append("info ! :5 != \n");
            else{
                error_text.append("info ! :5 ! \n");
                i--;
            }
            state = 0;
            break;
        case 6: // after '>'
            if(c == '=')
                error_text.append("info > :6 >= \n");
            else{
                error_text.append("info > :6 > \n");
                i--; // the original swallowed this character
            }
            state = 0;
            break;
        case 7: // after '<'
            if(c == '=')
                error_text.append("info < :7 <= \n");
            else{
                error_text.append("info < :7 < \n");
                i--;
            }
            state = 0;
            break;
        case 8: // after '='
            if(c == '=')
                error_text.append("info = :8 == \n");
            else{
                error_text.append("info = :8 = \n");
                i--;
            }
            state = 0;
            break;
        case 10: // inside an identifier or keyword
            if(!(isLetter(c) || isDigit(c))){
                end = i;
                String id = content.substring(begin, end);
                if(isKey(id))
                    error_text.append("info keyword :10 " + id + '\n');
                else
                    error_text.append("info identifier :10 " + id + '\n');
                i--;
                state = 0;
            }
            break;
        case 11: // inside a number
            if(c == 'e' || c == 'E')
                state = 13;
            else if(isDigit(c) || c == '.'){
                // still inside the number
            }else if(isLetter(c)){
                error_text.append("error: line " + line + " row " + row + " malformed number\n");
                i = find(i, content); // skip to just before the next delimiter
                state = 0;
            }else{
                end = i;
                error_text.append("info:0 " + content.substring(begin, end) + '\n');
                i--;
                state = 0;
            }
            break;
        case 12: { // after '#': only "#include <...>" is recognized
            String directive = "";
            while(c != '<' && c != '\n'){
                directive += c;
                i++;
                c = content.charAt(i);
            }
            if(c == '<' && directive.trim().equals("include")){
                while(c != '>' && c != '\n'){
                    i++;
                    c = content.charAt(i);
                }
            }
            if(c == '>')
                error_text.append("info # :12 \n");
            else
                error_text.append("error: line " + line + ", row " + row + " unrecognized directive after '#'\n");
            if(c == '\n') i--; // let state 0 count the line break
            state = 0;
            break;
        }
        case 13: { // exponent part: optional sign, then at least one digit
            if(c == '+' || c == '-'){
                i++;
                c = content.charAt(i);
            }
            int digitsStart = i;
            while(isDigit(c)){
                i++;
                c = content.charAt(i);
            }
            if(i == digitsStart || isLetter(c) || c == '.'){
                error_text.append("error: line " + line + " row " + row + " malformed exponent\n");
                i = find(i, content);
            }else{
                end = i;
                error_text.append("info ex :13 " + content.substring(begin, end) + '\n');
                i--;
            }
            state = 0;
            break;
        }
        case 14: // after '&'
            if(c == '&')
                error_text.append("info '&&' :14 \n");
            else{
                i--;
                error_text.append("info '&' :14 \n");
            }
            state = 0;
            break;
        case 15: // after '|'
            if(c == '|')
                error_text.append("info '||' :15 \n");
            else{
                i--;
                error_text.append("info '|' :15 \n");
            }
            state = 0;
            break;
        case 16: // after '"': the quote itself is reported, string bodies are not scanned
            error_text.append("info :16 " + '"' + '\n');
            i--;
            state = 0;
            break;
        }
    }
    // Preprocessed source (with "//" comments removed) goes to the console
    System.out.print(stripped);
    error_text.append("Have checked lexical! \n");
}
boolean isLetter(char c){
if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_')
return true;
return false;
}
boolean isDigit(char c){
if(c >= '0' && c <= '9') return true;
return false;
}
boolean isKey(String str){
if(str.equals("int") || str.equals("real") || str.equals("for") || str.equals("while") || str.equals("do") || str.equals("begin") || str.equals("end") || str.equals("if") || str.equals("then")||
str.equals("AND") || str.equals("OR") || str.equals("NOT") || str.equals("repeat") || str.equals("until") || str.equals("read") || str.equals("write") || str.equals("return") || str.equals("true")
|| str.equals("false") || str.equals("boolean") || str.equals("program") || str.equals("const") || str.equals("to"))
return true;
return false;
}
int find(int begin, String str){
if(begin >= str.length())
return str.length();
for(int i = begin; i < str.length(); i++){
char c = str.charAt(i);
if(c == '\n' || c == ',' || c == ' ' || c == '\t' || c == '{' || c =='}' || c == '(' || c == ')' || c == ';' || c == '=' || c == '+'|| c == '-' || c == '*' || c == '/')
return i - 1;
}
return str.length();
}
}
Verification code:
import java.io.File;

public class Exam1
{
    public static void main (String args[])
    {
        File f1=new File("F:\\Prog\\File");
        File f2=new File("F:\\Prog\\File","inputoutput.txt");
        File f3=new File(f1,"inputoutput.txt");
        // File objects are initialized; now check whether each path exists
        boolean b1=f1.exists();
        boolean b2=f2.exists();
        boolean b3=f3.exists();
        System.out.println("f1"+b1+" "+"f2"+b2+" "+"f3"+b3);
    }
}
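How to implement this lexical analyzer:
Step 1: Write the regular expressions
The regular expression for a variable is: letter(letter|digit)*
The operators are: &, |, ~, (, ). Because each operator is a fixed character, its regular expression is the character itself.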
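The two expressions from step 1 can be tried out directly with `java.util.regex`. This is only a quick check of the expressions, not the way the analyzer itself is built; the class name `RegexSketch` is hypothetical, and treating `_` as a letter is an assumption borrowed from `isLetter` in the appendix code.

```java
import java.util.regex.Pattern;

// Sketch: the step-1 regular expressions written as Java patterns.
class RegexSketch {
    // letter(letter|digit)* with ASCII letters; '_' counts as a letter (assumption)
    static final Pattern VARIABLE = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");
    // each operator is a single fixed character
    static final Pattern OPERATOR = Pattern.compile("[&|~()]");

    static boolean isVariable(String s) {
        return VARIABLE.matcher(s).matches();
    }

    static boolean isOperator(String s) {
        return OPERATOR.matcher(s).matches();
    }
}
```

With these patterns `isVariable("x1")` is true and `isVariable("1x")` is false, because the first symbol must be a letter.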
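Step 2: Construct the NFA from the regular expressions
Building an NFA from a regular expression starts with the NFA of each symbol in it. The variable expression contains only two symbols, letter and digit, whose NFAs are, respectively:
Applying the construction rule for alternation (|) gives the NFA for letter|digit shown in the figure below:
The NFA for (letter|digit)* is:
The NFA for letter(letter|digit)* is: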
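Step 3: Convert the NFA to a DFA
First, merge all states that can be reached through ε. In the NFA above, states 1, 2, 3, 4, 6, 9 are connected directly by ε arrows; in the same way 5, 8, 9, 3, 4, 6 and 7, 8, 9, 3, 4, 6 can each be merged into a single state, as shown in the figure below:
Redrawn as a new DFA, this becomes:
The resulting DFA has more states than necessary, so it should be minimized to obtain the DFA with the fewest states. Minimization merges groups of states whose transitions always lead back into the same group. In the figure above, the three states {B, C, D} form such a group: on either letter or digit they still move to a state inside the group, so the three can be merged, as shown in the figure below:
With the new state names this is:
The DFA for variables in the script is now complete.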
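The ε-merging in step 3 can be reproduced in code. The sketch below is an illustration under a loud assumption: the figures are not reproduced here, so the edge lists `EPS` and `STEP` are reconstructed from the state sets quoted in the text (start state 0, with 0 --letter--> 1, 4 --letter--> 5 and 6 --digit--> 7). The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Sketch: ε-closure and subset construction for the NFA of letter(letter|digit)*.
class SubsetConstructionSketch {
    static final int LETTER = 0, DIGIT = 1;
    // Assumed ε-edges: EPS[s] lists the states reachable from s by one ε-move.
    static final int[][] EPS = { {}, {2}, {3, 9}, {4, 6}, {}, {8}, {}, {8}, {3, 9}, {} };
    // Assumed labeled edges: STEP[s][symbol] is the target state, or -1 if none.
    static final int[][] STEP = {
        {1, -1}, {-1, -1}, {-1, -1}, {-1, -1}, {5, -1},
        {-1, -1}, {-1, 7}, {-1, -1}, {-1, -1}, {-1, -1}
    };

    // All states reachable from "start" using only ε-moves.
    static TreeSet<Integer> closure(Collection<Integer> start) {
        TreeSet<Integer> seen = new TreeSet<>(start);
        Deque<Integer> work = new ArrayDeque<>(start);
        while (!work.isEmpty()) {
            for (int t : EPS[work.pop()]) {
                if (seen.add(t)) work.push(t);
            }
        }
        return seen;
    }

    // States reachable from "states" by consuming one input symbol.
    static TreeSet<Integer> move(Set<Integer> states, int symbol) {
        TreeSet<Integer> out = new TreeSet<>();
        for (int s : states) {
            if (STEP[s][symbol] >= 0) out.add(STEP[s][symbol]);
        }
        return out;
    }

    // DFA states in discovery order; each DFA state is a set of NFA states.
    static List<Set<Integer>> dfaStates() {
        List<Set<Integer>> found = new ArrayList<>();
        found.add(closure(Collections.singleton(0)));
        for (int k = 0; k < found.size(); k++) {
            for (int symbol = LETTER; symbol <= DIGIT; symbol++) {
                Set<Integer> next = closure(move(found.get(k), symbol));
                if (!next.isEmpty() && !found.contains(next)) found.add(next);
            }
        }
        return found;
    }
}
```

Under these assumptions `dfaStates()` finds four sets: {0}, {1,2,3,4,6,9}, {3,4,5,6,8,9} and {3,4,6,7,8,9}. The last three are the sets named in the text, and each of them moves only into one of those three on letter or digit, which is why minimization can merge them.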
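Because the operators are fixed characters, their DFAs are simple; a few examples are shown in the figures below:
Now we combine the DFAs of the individual tokens, roughly as in the figure below:
In practice, an example this simple does not need the formal procedure to obtain a DFA; the diagram can be drawn directly. Start with a start state (Start) and an end state (Done), then classify the tokens according to how they relate to one another. This classification depends on experience. Following how each token is recognized character by character gives a DFA like the one in the figure above, and writing the lexical analysis program from that DFA is then straightforward.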
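A hand-written scanner of the Start/Done kind described above might look like the following for the script language (variables plus the operators &, |, ~, (, )). This is a minimal sketch, not the author's program: the class name, the intermediate state `IN_VARIABLE` and the output format are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a direct-coded DFA with a Start state and a Done state.
class StartDoneScanner {
    enum State { START, IN_VARIABLE, DONE }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    static List<String> scan(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            State state = State.START;  // every token starts in Start
            int begin = i;
            while (state != State.DONE) {
                // '\0' stands for "end of input" once i runs past the last character
                char c = i < src.length() ? src.charAt(i) : '\0';
                switch (state) {
                case START:
                    if (isLetter(c)) {
                        state = State.IN_VARIABLE;
                        i++;
                    } else {
                        if ("&|~()".indexOf(c) >= 0) {
                            out.add("OP(" + c + ")");
                        } else if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                            out.add("ERROR(" + c + ")");
                        }
                        i++;  // operators, whitespace and errors are one character long
                        state = State.DONE;
                    }
                    break;
                case IN_VARIABLE:
                    if (isLetter(c) || isDigit(c)) {
                        i++;
                    } else {
                        // the current character belongs to the next token, so it is not consumed
                        out.add("VAR(" + src.substring(begin, i) + ")");
                        state = State.DONE;
                    }
                    break;
                default:
                    break;
                }
            }
        }
        return out;
    }
}
```

For the input `a1&(~b | c)` the scanner returns `[VAR(a1), OP(&), OP((), OP(~), VAR(b), OP(|), VAR(c), OP())]`. The one design point worth noting is that the variable state does not consume the character that ends the variable, which plays the same role as the `i--` re-scan in the appendix code.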