722. Remove Comments

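This post describes a way to remove line comments and block comments from C++ source code. It assembles each output line with a StringBuilder, which avoids creating a new String instance for every appended character.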
Description

Given a C++ program, remove comments from it. The program source is an array where source[i] is the i-th line of the source code. This represents the result of splitting the original source code string by the newline character \n.

In C++, there are two types of comments: line comments and block comments.

The string // denotes a line comment, which means that it and the rest of the characters to the right of it on the same line should be ignored.

The string /* denotes a block comment, which means that all characters until the next (non-overlapping) occurrence of */ should be ignored. (Here, occurrences happen in reading order: line by line from left to right.) To be clear, the string /*/ does not yet end the block comment, as the ending would overlap the beginning.

The first effective comment takes precedence over others: if the string // occurs in a block comment, it is ignored. Similarly, if the string /* occurs in a line or block comment, it is also ignored.

If a certain line of code is empty after removing comments, you must not output that line: each string in the answer list will be non-empty.

There will be no control characters, single quote, or double quote characters. For example, source = "string s = "/* Not a comment. */";" will not be a test case. (Also, nothing else such as defines or macros will interfere with the comments.)

It is guaranteed that every open block comment will eventually be closed, so /* outside of a line or block comment always starts a new comment.

Finally, implicit newline characters can be deleted by block comments. Please see the examples below for details.

After removing the comments from the source code, return the source code in the same format.

Example 1:
Input:
source = ["/*Test program */", "int main()", "{ ", " // variable declaration ", "int a, b, c;", "/* This is a test", " multiline ", " comment for ", " testing */", "a = b + c;", "}"]

The line by line code is visualized as below:
/*Test program */
int main()
{
// variable declaration
int a, b, c;
/* This is a test
multiline
comment for
testing */
a = b + c;
}

Output: ["int main()","{ "," ","int a, b, c;","a = b + c;","}"]

The line by line code is visualized as below:
int main()
{

int a, b, c;
a = b + c;
}

Explanation:
The string /* denotes a block comment, including line 1 and lines 6-9. The string // denotes line 4 as comments.
Example 2:
Input:
source = ["a/*comment", "line", "more_comment*/b"]
Output: ["ab"]
Explanation: The original source string is "a/*comment\nline\nmore_comment*/b", where \n marks the newline characters. After deletion, the implicit newline characters are deleted along with the comment, leaving the string "ab", which when delimited by newline characters becomes ["ab"].
Note:

The length of source is in the range [1, 100].
The length of source[i] is in the range [0, 80].
Every open block comment is eventually closed.
There are no single-quote, double-quote, or control characters in the source code.



Solution

Given the program text as a String[], remove the comments of the forms "//" and "/* */".

Use an ArrayList to store the result and a StringBuilder to assemble each output line with fewer allocations (plain String concatenation starting from "" also works). A boolean multi records whether we are currently inside a block comment.

Iterate over every string in source, and over every character in that string. If we are inside a block comment, check whether the current position starts "*/"; if it does, leave the comment and use i++ to skip the second character of "*/". If we are not inside a block comment, there are three cases.

  1. Normal code: append the character to the StringBuilder.
  2. "//": ignore the rest of this string, that is, break.
  3. "/*": set multi to true and use i++ to skip the second character of "/*".

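After a string has been fully scanned, if we are not inside a block comment and the StringBuilder is non-empty, add its contents to res and start a new StringBuilder. If we are still inside a block comment, keep the StringBuilder as it is, so that the code before "/*" is joined with the code after "*/" on a later line.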

Code
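import java.util.ArrayList;
import java.util.List;

class Solution {
    public List<String> removeComments(String[] source) {
        StringBuilder sb = new StringBuilder(); // output line currently being assembled
        List<String> res = new ArrayList<>();
        boolean multi = false;                  // true while inside a /* ... */ comment
        for (String s : source) {
            for (int i = 0; i < s.length(); i++) {
                if (multi) {
                    // Inside a block comment: only look for the closing "*/".
                    if (s.charAt(i) == '*' && i < s.length() - 1 && s.charAt(i + 1) == '/') {
                        multi = false;
                        i++; // skip the '/' of "*/"
                    }
                } else {
                    // "//": the rest of this line is a comment.
                    if (s.charAt(i) == '/' && i < s.length() - 1 && s.charAt(i + 1) == '/') {
                        break;
                    }
                    // "/*": a block comment starts here.
                    if (s.charAt(i) == '/' && i < s.length() - 1 && s.charAt(i + 1) == '*') {
                        multi = true;
                        i++; // skip the '*' of "/*"
                    } else {
                        sb.append(s.charAt(i));
                    }
                }
            }
            // Emit the line only when no block comment is still open and it is non-empty.
            if (!multi && sb.length() > 0) {
                res.add(sb.toString());
                sb = new StringBuilder();
            }
        }
        return res;
    }
}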

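Time Complexity: O(m * n), where m is the number of lines in source and n is the maximum line length; every character is visited once.
Space Complexity: O(m * n) for the result list in the worst case, when nothing is removed.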
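
As a cross-check on the line-by-line solution, here is a minimal sketch of a different formulation: join the lines with "\n", strip the comments in a single pass, then split the result again. It is not the approach used in this post. The class name JoinedScanSketch and its static removeComments helper are hypothetical names chosen for illustration. The sketch makes the "implicit newline characters can be deleted by block comments" rule explicit, because a "\n" inside a block comment is skipped like any other character.

```java
import java.util.ArrayList;
import java.util.List;

class JoinedScanSketch {
    // Hypothetical helper (not from the post): works on the joined text
    // instead of scanning line by line.
    static List<String> removeComments(String[] source) {
        String text = String.join("\n", source);
        StringBuilder out = new StringBuilder();
        boolean inBlock = false;
        int i = 0;
        while (i < text.length()) {
            if (inBlock) {
                if (text.startsWith("*/", i)) {
                    inBlock = false;
                    i += 2;
                } else {
                    i++; // drops comment text, including any '\n' inside the comment
                }
            } else if (text.startsWith("/*", i)) {
                inBlock = true;
                i += 2; // consuming both characters means "/*/" cannot close itself
            } else if (text.startsWith("//", i)) {
                // Skip to the end of the line but keep the '\n' itself.
                while (i < text.length() && text.charAt(i) != '\n') {
                    i++;
                }
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        // Split back into lines and drop the empty ones.
        List<String> res = new ArrayList<>();
        for (String line : out.toString().split("\n")) {
            if (!line.isEmpty()) {
                res.add(line);
            }
        }
        return res;
    }

    public static void main(String[] args) {
        String[] example2 = {"a/*comment", "line", "more_comment*/b"};
        System.out.println(removeComments(example2)); // prints [ab]
    }
}
```

The trade-off is one extra copy of the whole input in exchange for a scanner that no longer needs to carry state across lines by hand.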


Review

Appending to a StringBuilder costs fewer resources than creating a new String every time a character is added.
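
To illustrate the point, here is a small sketch comparing the two ways of building a string character by character. The class BuilderReuseSketch and its method names are hypothetical and not part of the solution. The last method also shows, as an optional variation, that one builder can be emptied with setLength(0) instead of being replaced by a new StringBuilder after each line.

```java
import java.util.ArrayList;
import java.util.List;

class BuilderReuseSketch {
    // Concatenation: every += produces a new String that copies the old contents.
    static String viaConcat(String s) {
        String out = "";
        for (int i = 0; i < s.length(); i++) {
            out += s.charAt(i);
        }
        return out;
    }

    // StringBuilder: characters are appended into one growable buffer.
    static String viaBuilder(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(s.charAt(i));
        }
        return sb.toString();
    }

    // One builder reused across lines: setLength(0) empties it in place.
    static List<String> copyLines(String[] lines) {
        StringBuilder sb = new StringBuilder();
        List<String> res = new ArrayList<>();
        for (String line : lines) {
            sb.append(line);
            res.add(sb.toString());
            sb.setLength(0);
        }
        return res;
    }
}
```

Both builders return the same text; the difference is only in how many intermediate objects are allocated along the way.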
