By reading the token () method in cs-tokenizer.cs, I can see how the C# compiler works.
The following is the most important part of the driver:
static void tokenize_file (SourceFile file)
First, mcs uses hand-written conditional expressions in place of the generic,
textbook-style parser (for example, " if ( is_identifier || is_identifier_numeric){...} ").
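To make the idea of hand-written character tests concrete, here is a minimal sketch. It is not the mcs code: the names IdentifierSketch, IsIdentifierStart, IsIdentifierPart and ScanIdentifier are hypothetical, and the rules are simplified to letters, digits and underscores.

```csharp
using System;
using System.Text;

static class IdentifierSketch
{
    // Hypothetical helper: may this character start an identifier?
    static bool IsIdentifierStart (char c)
    {
        return char.IsLetter (c) || c == '_';
    }

    // Hypothetical helper: may this character continue an identifier?
    static bool IsIdentifierPart (char c)
    {
        return char.IsLetterOrDigit (c) || c == '_';
    }

    // Reads an identifier starting at `pos' and advances `pos' past it.
    // Returns null (and leaves `pos' unchanged) if no identifier starts there.
    public static string ScanIdentifier (string input, ref int pos)
    {
        if (pos >= input.Length || !IsIdentifierStart (input [pos]))
            return null;

        var sb = new StringBuilder ();
        while (pos < input.Length && IsIdentifierPart (input [pos]))
            sb.Append (input [pos++]);
        return sb.ToString ();
    }
}
```

The point of the sketch is only that plain boolean tests on each character are enough to recognize a token; no generated table or grammar is involved at this level.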
Each time a token is returned, the location for the token is
recorded into the `Location' property, that can be accessed by
the parser. The parser retrieves the Location properties as
it builds its internal representation to allow the semantic
analysis phase to produce error messages that can pinpoint
the location of the problem.
Some tokens have values associated with them, for example when
the tokenizer encounters a string, it will return a
LITERAL_STRING token, and the actual string parsed will be
available in the `Value' property of the tokenizer. The same
mechanism is used to return integers and floating point
numbers.
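The interplay of token (), `Value' and `Location' described above can be sketched as follows. This is a toy under stated assumptions, not the mcs Tokenizer: MiniTokenizer and TokenKind are hypothetical names, the location is just a line number, string literals have no escapes, and any other run of non-blank characters is reported as an identifier.

```csharp
using System;

enum TokenKind { EOF, IDENTIFIER, LITERAL_INTEGER, LITERAL_STRING }

class MiniTokenizer
{
    readonly string text;
    int pos;
    int line = 1;

    public object Value { get; private set; }  // payload of the last token, if any
    public int Location { get; private set; }  // simplified: line of the last token

    public MiniTokenizer (string text) { this.text = text; }

    public TokenKind token ()
    {
        // Skip whitespace, counting lines as we go
        while (pos < text.Length && char.IsWhiteSpace (text [pos])) {
            if (text [pos] == '\n')
                line++;
            pos++;
        }

        Location = line;   // recorded before the token is consumed
        Value = null;
        if (pos >= text.Length)
            return TokenKind.EOF;

        int start = pos;
        char c = text [pos];
        if (c == '"') {
            // String literal: the text between the quotes becomes Value
            int end = text.IndexOf ('"', pos + 1);
            if (end < 0)
                end = text.Length;   // unterminated: take the rest of the input
            Value = text.Substring (pos + 1, end - pos - 1);
            pos = Math.Min (end + 1, text.Length);
            return TokenKind.LITERAL_STRING;
        }
        if (char.IsDigit (c)) {
            while (pos < text.Length && char.IsDigit (text [pos]))
                pos++;
            // int.Parse throws on overflow; good enough for a sketch
            Value = int.Parse (text.Substring (start, pos - start));
            return TokenKind.LITERAL_INTEGER;
        }
        // Simplification: everything else up to a blank or quote is an identifier
        while (pos < text.Length && !char.IsWhiteSpace (text [pos]) && text [pos] != '"')
            pos++;
        Value = text.Substring (start, pos - start);
        return TokenKind.IDENTIFIER;
    }
}
```

A driver in the spirit of tokenize_file would simply call token () in a loop until it returns EOF, reading `Value' and `Location' after each call.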
//
// I do not understand why the location is designed this way.
//
** Locations
Locations are encoded as a 32-bit number (the Location
struct) that maps each input source line to a linear number.
As new files are parsed, the Location manager is informed of
the new file, to allow it to map back from an int constant to
a file + line number.
Prior to parsing/tokenizing any source files, the compiler
generates a list of all the source files and then reserves the
low N bits of the location to hold the source file, where N is
large enough to hold at least twice as many source files as were
specified on the command line (to allow for a #line in each file).
The upper 32-N bits are the line number in that file.
The value 0 is reserved for ``anonymous'' locations, i.e. when we
don't know the location (Location.Null).
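The encoding described above can be sketched as below, which may also help explain why the design exists: a location costs a single int instead of a file name plus a line number. The class LocationTable and its members are hypothetical names, not the mcs Location API. The sketch also assumes that file indices start at 1, so that a real location can never collide with the reserved value 0.

```csharp
using System;
using System.Collections.Generic;

class LocationTable
{
    public const int Null = 0;   // reserved: "location unknown"

    readonly List<string> files = new List<string> ();
    int fileBits;                // N: low bits reserved for the file index

    public int FileBits => fileBits;

    // Call once, before tokenizing, with every file from the command line.
    public void Initialize (IList<string> sourceFiles)
    {
        files.Clear ();
        files.Add ("<unknown>");   // assumption: index 0 is never a real file
        files.AddRange (sourceFiles);

        // Room for twice as many files as given (a #line in each), plus slot 0
        int capacity = 2 * sourceFiles.Count + 1;
        fileBits = 1;
        while ((1 << fileBits) < capacity)
            fileBits++;
    }

    // Low N bits: file index. Upper 32-N bits: line number.
    public int Encode (int fileIndex, int line)
    {
        return (line << fileBits) | fileIndex;
    }

    public string FileOf (int loc)
    {
        return files [loc & ((1 << fileBits) - 1)];
    }

    public int LineOf (int loc)
    {
        return (int) ((uint) loc >> fileBits);
    }
}
```

For example, with three source files the sketch reserves 3 bits, so line 10 of the second file is encoded as (10 << 3) | 2 = 82, and decoding 82 gives back the file and the line.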
The tokenizer also tracks the column number for a token, but
this is currently not being used or encoded. It could
probably be encoded in the low 9 bits, allowing for columns
from 1 to 512 to be encoded.
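The column idea can be sketched the same way. Nine bits hold 512 distinct values, so columns 1 to 512 fit if the stored value is column - 1. ColumnPacking and its members are hypothetical names, and clamping columns beyond 512 is my own assumption, not something the Mono text specifies.

```csharp
using System;

static class ColumnPacking
{
    const int ColumnBits = 9;
    const int ColumnMask = (1 << ColumnBits) - 1;   // 0x1FF

    // `lineAndFile' is an already-encoded location without a column.
    public static int Pack (int lineAndFile, int column)
    {
        // Columns are 1-based: clamp to 1..512, then store column - 1
        int stored = Math.Min (Math.Max (column, 1), 1 << ColumnBits) - 1;
        return (lineAndFile << ColumnBits) | stored;
    }

    public static int ColumnOf (int packed)
    {
        return (packed & ColumnMask) + 1;
    }

    public static int LineAndFileOf (int packed)
    {
        return (int) ((uint) packed >> ColumnBits);
    }
}
```

The price is that the 9 bits are taken away from the rest of the 32-bit value, leaving fewer bits for the line number and the file index.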
Notes after reading Mono
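Summary: This post uses the token () method in cs-tokenizer.cs to understand how the C# compiler works. It covers the tokenize_file function, how the compiler replaces a generic parser with hand-written expressions, and how token locations are recorded so that semantic analysis can report errors. It also describes how Locations are encoded and how the tokenizer handles strings, numbers, and similar tokens.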