Remove comments in a string

Posted 2019-09-21 00:19

Question:

 private static String filterString(String code) {
     String partialFiltered = code.replaceAll("/\\*.*\\*/", "");
     String fullFiltered = partialFiltered.replaceAll("//.*(?=\\n)", "");
     return fullFiltered;
 }

I tried the code above to remove all comments from a string, but it isn't working. Please help.

Answer 1:

This works with both // single-line and /* multi-line */ comments.

String sourceCode =
         "/*\n"
        + " * Multi-line comment\n"
        + " * Creates a new Object.\n"
        + " */\n"
        + "public Object someFunction() {\n"
        + " // single line comment\n"
        + " Object obj =  new Object();\n"
        + " return obj; /* single-line comment */\n"
        + "}";

// (?s:.*?) lets '.' cross line breaks and stops at the first "*/".
System.out.println(sourceCode.replaceAll(
        "//.*|/\\*(?s:.*?)\\*/", ""));

Input :

/*
 * Multi-line comment
 * Creates a new Object.
 */
public Object someFunction() {
    // single line comment
    Object obj =  new Object();
    return obj; /* single-line comment */
}

Output :

public Object someFunction() {

    Object obj =  new Object();
    return obj; 
}


Answer 2:

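How about the following? Note that it removes only the comment delimiters and leaves the comment text in place:

      private static String filterString(String code) {
          // String.replace does literal (non-regex) replacement.
          return code.replace("//", "").replace("/*", "").replace("*/", "");
      }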


Answer 3:

Replace the following code:

partialFiltered.replaceAll("//.*(?=\\n)", "");

with:

partialFiltered.replaceAll("//.*?\n","\n");
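
One thing worth checking with either pattern is a // comment on the last line of the input when there is no trailing newline. The sketch below compares the original lookahead pattern, the replacement suggested here, and a plain "//.*". The class name, method names, and sample input are made up for illustration.

```java
final class TrailingNewlineDemo {
    // Original pattern: requires a '\n' right after the comment.
    static String withLookahead(String code) {
        return code.replaceAll("//.*(?=\\n)", "");
    }

    // Suggested replacement: consumes the '\n' and puts it back.
    static String withConsumedNewline(String code) {
        return code.replaceAll("//.*?\n", "\n");
    }

    // No newline requirement: '.' stops at the line end by default.
    static String withoutNewline(String code) {
        return code.replaceAll("//.*", "");
    }

    public static void main(String[] args) {
        // Sample input with no newline after the final comment.
        String code = "int a; // first\nint b; // last";
        System.out.println(withLookahead(code));
        System.out.println(withConsumedNewline(code));
        System.out.println(withoutNewline(code));
    }
}
```

With this input, the first two variants leave "// last" in place because no newline follows it, while the third removes both comments.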



Answer 4:

You need to put (?s) at the start of your partialFiltered regex so that . also matches line terminators and a comment can span multiple lines (this is the inline form of Pattern.DOTALL, which works with String.replaceAll).

But then the .* in the middle of /\\*.*\\*/ is greedy, so I'd expect it to match everything from the start of the first comment block to the end of the last one. E.g., given the following:

/* Comment #1 */
for (i = 0; i < 10; i++)
{
    i++
}
/* Comment #2 */

I haven't tested this, but I would expect it to remove everything, including the code in the middle, rather than just the two comments. One way to prevent this is to use .*? to make the inner match non-greedy, i.e. to match as little as possible:

String partialFiltered = code.replaceAll("(?s)/\\*.*?\\*/", "");
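
The difference between the greedy and non-greedy forms can be checked with a small sketch like the one below. The class and method names are hypothetical, and the sample input is a shortened version of the two-comment example.

```java
final class GreedyVsLazy {
    // Greedy: '.*' runs to the LAST "*/" in the input.
    static String stripGreedy(String code) {
        return code.replaceAll("(?s)/\\*.*\\*/", "");
    }

    // Non-greedy: '.*?' stops at the FIRST "*/" after each "/*".
    static String stripLazy(String code) {
        return code.replaceAll("(?s)/\\*.*?\\*/", "");
    }

    public static void main(String[] args) {
        String code = "/* Comment #1 */\nint i = 0;\n/* Comment #2 */";
        // Brackets make an empty result visible.
        System.out.println("[" + stripGreedy(code) + "]");
        System.out.println("[" + stripLazy(code) + "]");
    }
}
```

Here the greedy version returns an empty string, because the single match covers both comments and the code between them. The non-greedy version keeps the int i = 0; line.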

Since the fullFiltered regex doesn't begin with (?s), it should work without the (?=\\n), because . does not match line terminators by default and so .* stops at the end of the line. You should be able to change it to:

String fullFiltered = partialFiltered.replaceAll("//.*", "");
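
Putting the two changes together, the method from the question might look like the sketch below. The wrapper class name and the sample input are assumptions for illustration, and string literals are still not handled.

```java
final class CommentFilter {
    static String filterString(String code) {
        // (?s) lets '.' cross line breaks; '.*?' stops at the first "*/".
        String partialFiltered = code.replaceAll("(?s)/\\*.*?\\*/", "");
        // No (?s) here, so '.*' ends at the end of the current line.
        String fullFiltered = partialFiltered.replaceAll("//.*", "");
        return fullFiltered;
    }

    public static void main(String[] args) {
        String code = "int a; // x\n/* y\n z */int b;";
        System.out.println(filterString(code));
    }
}
```

The line break after the // comment is preserved, so the remaining code keeps its line structure.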

There are also possible issues with looking for the characters that denote a comment, e.g. when they appear inside a string literal or a regular expression pattern. I'm assuming these cases aren't important for your application. If they are, it's probably the end of the road for simple regular expressions, and you may need a parser instead.
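
To make the string-literal problem concrete, here is a small sketch in which a URL inside a string literal is mistaken for a // comment. The class name, method name, and sample line are invented for the example.

```java
final class LiteralPitfall {
    // Same two simple patterns as above, applied in sequence.
    static String naiveStrip(String code) {
        return code.replaceAll("(?s)/\\*.*?\\*/", "")
                   .replaceAll("//.*", "");
    }

    public static void main(String[] args) {
        String code = "String url = \"http://example.com\"; // home page";
        // The first "//" is inside the literal, so the rest of the
        // line, including the closing quote, is removed.
        System.out.println(naiveStrip(code));
    }
}
```

The result is String url = "http: with the literal cut off, which is the kind of damage that a tokenizer or parser avoids.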



Answer 5:

Maybe this can help someone:

return code.replaceAll(
                "((['\"])(?:(?!\\2|\\\\).|\\\\.)*\\2)|\\/\\/[^\\n]*|\\/\\*(?:[^*]|\\*(?!\\/))*\\*\\/", "$1");

To try it in a regex tester, use the unescaped form: ((['"])(?:(?!\2|\\).|\\.)*\2)|\/\/[^\n]*|\/\*(?:[^*]|\*(?!\/))*\*\/
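
A runnable sketch of how this pattern behaves is below. The first alternative captures a quoted literal as group 1 and writes it back through "$1". For a comment match, group 1 does not participate, so nothing is written. The class name, method name, and sample input are hypothetical, and cases such as Java text blocks are not covered.

```java
final class LiteralAwareStripper {
    // Alternative 1: a '...' or "..." literal, kept via group 1.
    // Alternative 2: a // comment. Alternative 3: a /* */ comment.
    private static final String REGEX =
        "((['\"])(?:(?!\\2|\\\\).|\\\\.)*\\2)"
        + "|\\/\\/[^\\n]*"
        + "|\\/\\*(?:[^*]|\\*(?!\\/))*\\*\\/";

    static String strip(String code) {
        // "$1" restores literals and is empty for comment matches.
        return code.replaceAll(REGEX, "$1");
    }

    public static void main(String[] args) {
        String code = "String s = \"/* not a comment */\"; /* real */ // tail";
        System.out.println("[" + strip(code) + "]");
    }
}
```

The text inside the string literal survives even though it looks like a comment, while the real block and line comments after it are removed.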