Given a C++ program, remove comments from it. The program source is an array where source[i] is the i-th line of the source code. This represents the result of splitting the original source code string by the newline character \n.
In C++, there are two types of comments: line comments and block comments.
The string // denotes a line comment, which means that it and the rest of the characters to the right of it on the same line should be ignored.
The string /* denotes a block comment, which represents that all characters until the next (non-overlapping) occurrence of */ should be ignored. (Here, occurrences happen in reading order: line by line from left to right.) To be clear, the string /*/ does not yet end the block comment, as the ending would be overlapping the beginning.
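The non-overlapping rule can be illustrated with a minimal sketch. It assumes the comment opener has already been located; `block_comment_end` is a hypothetical helper name used only for this illustration, not part of the problem statement.

```python
def block_comment_end(text: str, start: int) -> int:
    """Return the index just past the */ that closes the block comment
    opening at text[start:start + 2], or -1 if it is not closed in text."""
    assert text.startswith("/*", start)
    # Begin the search two characters later, so the '*' of the opener
    # can never be reused as the '*' of the closer (the "/*/" case).
    close = text.find("*/", start + 2)
    return -1 if close == -1 else close + 2


print(block_comment_end("/*/", 0))         # -1: "/*/" does not close itself
print(block_comment_end("/*/ x /**/", 0))  # 10: the inner "/*" is ignored
```

The second call also shows the precedence rule: the `/*` that appears inside the open comment does not start a new one.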
The first effective comment takes precedence over others: if the string // occurs in a block comment, it is ignored. Similarly, if the string /* occurs in a line or block comment, it is also ignored.
If a certain line of code is empty after removing comments, you must not output that line: each string in the answer list will be non-empty.
There will be no control characters, single quote, or double quote characters. For example, source = "string s = "/* Not a comment. */";" will not be a test case. (Also, nothing else such as defines or macros will interfere with the comments.)
It is guaranteed that every open block comment will eventually be closed, so /* outside of a line or block comment always starts a new comment.
Finally, implicit newline characters can be deleted by block comments. Please see the examples below for details.
After removing the comments from the source code, return the source code in the same format.
Example 1:
Input:
source = ["/*Test program */", "int main()", "{ ", " // variable declaration ", "int a, b, c;", "/* This is a test", " multiline ", " comment for ", " testing */", "a = b + c;", "}"]
The line by line code is visualized as below:
/*Test program */
int main()
{
// variable declaration
int a, b, c;
/* This is a test
multiline
comment for
testing */
a = b + c;
}
Output: ["int main()","{ "," ","int a, b, c;","a = b + c;","}"]
The line by line code is visualized as below:
int main()
{
int a, b, c;
a = b + c;
}
Explanation:
The string /* denotes a block comment, including line 1 and lines 6-9. The string // denotes line 4 as comments.
Example 2:
Input:
source = ["a/*comment", "line", "more_comment*/b"]
Output: ["ab"]
Explanation:
The original source string is "a/*comment\nline\nmore_comment*/b", where \n marks each newline character. After deletion, the implicit newline characters are deleted, leaving the string "ab", which when delimited by newline characters becomes ["ab"].
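Because the explanation treats the input as one string joined by newline characters, the whole task can also be sketched with a single regular expression over that joined string. This is an alternative approach, not the one used later in this post; `COMMENT` and `remove_comments_regex` are names made up for this sketch.

```python
import re
from typing import List

# At each position the alternatives are tried left to right, and the scan moves
# left to right through the text, so whichever comment starts first wins.
# "." does not match a newline, so a line comment stops at the end of its line.
# "[\s\S]*?" is non-greedy and may span newlines, so a block comment ends at
# the first later "*/".
COMMENT = re.compile(r"//.*|/\*[\s\S]*?\*/")


def remove_comments_regex(source: List[str]) -> List[str]:
    text = "\n".join(source)            # rebuild the original source string
    stripped = COMMENT.sub("", text)    # delete every comment, newlines included
    return [line for line in stripped.split("\n") if line]  # drop empty lines


print(remove_comments_regex(["a/*comment", "line", "more_comment*/b"]))  # ['ab']
```

This version relies on the guarantee that the input contains no quote characters. Real C++ source would need string and character literals handled as well.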
Note:
- The length of source is in the range [1, 100].
- The length of source[i] is in the range [0, 80].
- Every open block comment is eventually closed.
- There are no single-quote, double-quote, or control characters in the source code.
-----------
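This is a classic problem. The hard part is getting the state-machine transitions right. The key is to treat the handling of every character as a state transition, and to keep a buffer that carries leftover characters across lines (at the end of each line, decide whether the buffer should be flushed). The correct code is: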
from typing import List


class Solution:
    def removeComments(self, source: List[str]) -> List[str]:
        buffer, res = [], []  # buffer holds kept characters, possibly across lines
        in_block = False      # True while inside a /* ... */ comment
        for line in source:
            i, l = 0, len(line)
            while i < l:
                if not in_block and line[i:i + 2] == '//':
                    # Line comment: ignore the rest of this line.
                    break
                elif not in_block and line[i:i + 2] == '/*':
                    # Block comment opens; skip both characters so "/*/" cannot close it.
                    in_block = True
                    i += 2
                elif in_block and line[i:i + 2] == '*/':
                    # Block comment closes.
                    in_block = False
                    i += 2
                elif in_block:
                    # Character inside a block comment: drop it.
                    i += 1
                else:
                    # Ordinary code character: keep it.
                    buffer.append(line[i])
                    i += 1
            # Flush only if the line ends outside a block comment; otherwise the
            # newline is part of the comment and the buffer carries over.
            if not in_block:
                buffer_str = ''.join(buffer)
                if buffer_str:
                    res.append(buffer_str)
                buffer.clear()
        return res


s = Solution()
print(s.removeComments(source = ["struct Node{", " /*/ declare members;/**/", " int size;", " /**/int val;", "};"]))
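One possible way to make the transitions easier to audit is to name the states explicitly. The sketch below regroups the same loop by state and is not a different algorithm; `State`, `remove_comments_fsm` and `kept` are names chosen for this illustration.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    CODE = auto()   # outside any comment
    BLOCK = auto()  # inside /* ... */


def remove_comments_fsm(source: List[str]) -> List[str]:
    result: List[str] = []
    kept: List[str] = []  # characters kept so far for the current output line
    state = State.CODE
    for line in source:
        i = 0
        while i < len(line):
            pair = line[i:i + 2]
            if state is State.CODE:
                if pair == "//":
                    break                  # rest of the line is a comment
                if pair == "/*":
                    state = State.BLOCK
                    i += 2                 # consume the opener as a unit
                else:
                    kept.append(line[i])
                    i += 1
            else:  # State.BLOCK
                if pair == "*/":
                    state = State.CODE
                    i += 2                 # consume the closer as a unit
                else:
                    i += 1
        # A line ending in CODE ends an output line; a line ending in BLOCK
        # has had its newline swallowed, so kept carries over.
        if state is State.CODE:
            if kept:
                result.append("".join(kept))
            kept = []
    return result


print(remove_comments_fsm(["a/*comment", "line", "more_comment*/b"]))  # ['ab']
```

A line comment needs no state of its own here, because it always ends at the end of the current line and can be handled with a `break`.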
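If the line-level state and the character-level state are not thought through clearly, it is easy to end up with incorrect code like the following: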
from typing import List


# Incorrect version, kept on purpose to show the pitfalls.
class Solution:
    def removeComments(self, source: List[str]) -> List[str]:
        buffer, res = [], []
        in_block = False
        for line in source:
            l, flg = len(line), False
            for i in range(l):
                if in_block == False and line[i:i + 2] == '//':
                    # Bug: ignores what is already in buffer and may append an empty string.
                    res.append(line[:i])
                    flg = True
                    break
                elif in_block == False and line[i:i + 2] == '/*':
                    # Bug: i only advances by 1, so "/*/" is read as open followed by close.
                    buffer.append(line[:i])
                    in_block = True
                    flg = True
                elif in_block == True and line[i:i + 2] == '*/':
                    # Bug: the rest of the line is appended unscanned, so later comments survive.
                    if (i + 2 < l):
                        buffer.append(line[i + 2:])
                    if (''.join(buffer)):
                        res.append(''.join(buffer))
                    buffer.clear()
                    in_block = False
                    flg = True
                    break
            if (flg == False and in_block == False):
                res.append(line)
        return res


s = Solution()
print(s.removeComments(source = ["/*Test program */", "int main()", "{ ", " // variable declaration ", "int a, b, c;", "/* This is a test", " multiline ", " comment for ", " testing */", "a = b + c;", "}"]))
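Mistakes of this kind often pass the first example and show up only on edge cases. The sketch below is a small, hypothetical check harness: `EDGE_CASES`, `failing_cases` and `reference` are names invented here, the expected values were worked out by hand from the problem rules, and `reference` is a regex-based stand-in implementation, not the code from this post.

```python
import re
from typing import Callable, List, Tuple

Case = Tuple[List[str], List[str]]

# (input, expected) pairs; each one targets a shortcut that is easy to take.
EDGE_CASES: List[Case] = [
    (["//only a comment"], []),               # empty result lines must be dropped
    (["a/*/b*/c"], ["ac"]),                    # "/*/" must not close the comment
    (["a/**/b/**/c"], ["abc"]),                # two block comments on one line
    (["x//y/*z", "w"], ["x", "w"]),            # "/*" inside a line comment is ignored
    (["/*a//b*/c"], ["c"]),                    # "//" inside a block comment is ignored
    (["a/*comment", "line", "more_comment*/b"], ["ab"]),  # newline inside a block comment
]


def failing_cases(remove: Callable[[List[str]], List[str]]) -> List[Case]:
    """Return the cases for which remove() gives the wrong answer."""
    return [(src, want) for src, want in EDGE_CASES if remove(list(src)) != want]


def reference(source: List[str]) -> List[str]:
    # Stand-in implementation: strip comments from the newline-joined text.
    text = re.sub(r"//.*|/\*[\s\S]*?\*/", "", "\n".join(source))
    return [line for line in text.split("\n") if line]


print(failing_cases(reference))  # an empty list means every case passed
```

Passing `Solution().removeComments` from either listing in place of `reference` should show which cases that implementation gets wrong.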
