Notes after reading the Mono source

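This post uses the token() method in cs-tokenizer.cs to understand how the C# compiler works. It covers the tokenize_file function, how the compiler uses expressions in place of a generic parser, and how token positions are recorded so that semantic analysis can report errors. It also describes how Locations are encoded and how the tokenizer handles strings, numbers, and similar tokens.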

By following the token () method in cs-tokenizer.cs, I can see how the C# compiler works. Following it leads to the most important part of the driver:

static void tokenize_file (SourceFile file)

First, mcs uses hand-written expressions in place of the general parser described in textbooks (for example, " if ( is_identifier || is_identifier_numeric){...} ").

Each time a token is returned, the location for the token is recorded into the `Location' property, which can be accessed by the parser. The parser retrieves the Location properties as it builds its internal representation, so that the semantic analysis phase can produce error messages that pinpoint the location of the problem.

Some tokens have values associated with them. For example, when the tokenizer encounters a string, it returns a LITERAL_STRING token, and the actual string parsed is available in the `Value' property of the tokenizer. The same mechanism is used to return integers and floating point numbers.

// I do not understand why Location is designed this way.

** Locations

Locations are encoded as a 32-bit number (the Location struct) that maps each input source line to a linear number. As new files are parsed, the Location manager is informed of the new file, so that it can map back from an int constant to a file + line number.

Prior to parsing/tokenizing any source files, the compiler generates a list of all the source files and then reserves the low N bits of the location to hold the source file, where N is large enough to hold at least twice as many source files as were specified on the command line (to allow for a #line in each file). The upper 32-N bits are the line number in that file.

The token 0 is reserved for ``anonymous'' locations, i.e. if we don't know the location (Location.Null).

The tokenizer also tracks the column number for a token, but this is currently not being used or encoded. It could probably be encoded in the low 9 bits, allowing for columns from 1 to 512 to be encoded.
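
To make the encoding above more concrete, here is a minimal sketch of packing a file index and a line number into one 32-bit value. It is not the actual mcs code: the names PackedLocation, Initialize, File, Line and IsNull are hypothetical, and the sketch assumes that line numbers start at 1, so that the value 0 is never produced for a real location and can stand for the unknown location.

```csharp
using System;

// Illustrative sketch only; names and details are assumptions, not the mcs source.
public readonly struct PackedLocation
{
    static string[] files = new string[0];
    static int fileBits;    // N: how many low bits hold the file index
    static uint fileMask;   // (1 << N) - 1

    readonly uint token;    // 0 means "location unknown"

    // The default value has token == 0, which plays the role of Location.Null.
    public static readonly PackedLocation Null = default;

    // Call once, before tokenizing, with every file named on the command line.
    public static void Initialize(string[] sourceFiles)
    {
        files = sourceFiles;
        // Pick the smallest N with room for twice as many files,
        // leaving space for one #line directive per file.
        fileBits = 0;
        while ((1 << fileBits) < 2 * sourceFiles.Length)
            fileBits++;
        fileMask = (1u << fileBits) - 1;
    }

    // line is assumed to be >= 1, so a real location never encodes to 0.
    public PackedLocation(int fileIndex, int line)
    {
        token = ((uint)line << fileBits) | (uint)fileIndex;
    }

    public bool IsNull => token == 0;
    public string File => files[(int)(token & fileMask)];  // low N bits
    public int Line => (int)(token >> fileBits);           // upper 32-N bits
}

public static class Demo
{
    public static void Main()
    {
        PackedLocation.Initialize(new[] { "a.cs", "b.cs", "c.cs" });
        var loc = new PackedLocation(fileIndex: 1, line: 42);
        Console.WriteLine($"{loc.File}({loc.Line})");   // prints "b.cs(42)"
        Console.WriteLine(PackedLocation.Null.IsNull);  // prints "True"
    }
}
```

This may also suggest an answer to the question in the note above: if every token and syntax node carries a location, a single 32-bit value is far cheaper to store and copy than a file name plus a line number, and it can still be decoded back whenever an error message needs it.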