Javascript comment stripper [closed]

Posted 2019-02-12 01:56

Question:

I'm looking for a tool to remove comments from JavaScript sources. I was able to Google some, but none of them satisfied the following requirement: everything else should be left as it is, in particular white space is not removed, BUT if a comment takes a whole line, the line is removed too.

In short, I want to be able to go from a nicely formatted source with comments to an equally formatted source without comments. Lines which only contain comments are removed, and trailing comments are removed together with the spaces before them. All the rest is left as it is.

Do you know any tool for such a job?

EDIT: Let me be more specific. Using regular expressions alone is not possible, as the characters // or /* can also appear inside strings, regular expression literals and so on.

The tool should take this input

var a = true;

//the following code is every useful
var b = 2;//really, really useful
 /**
Never, ever do this
var c = 3;
  */
var d = 4;

and give this output

var a = true;

var b = 2;
var d = 4;

Answer 1:

Here's some code I whipped up: Check it out: here

Also here is an example of my code you can test RIGHT NOW in a webpage

Here's one I didn't write that could be handy, though his code will fail on certain regex literals: http://james.padolsey.com/javascript/removing-comments-in-javascript/

EDIT: The code I wrote is as is. I am not updating it as it is something I wrote when I was a teenager and rather new to programming. If there is a bug, you can fix it.



Answer 2:

Use Google's Closure Compiler with WHITESPACE_ONLY and PRETTY_PRINT. The only thing it will do is remove the comments (unless, of course, your code is not already formatted the way PRETTY_PRINT formats it).

It turns this:

// This function alerts a name
function hello(name) {
    /**
    * One lone
    * multi-line
    * comment
    */
    alert('Hello, ' + name);
}
hello('New user');

Into this:

function hello(name) {
  alert("Hello, " + name)
}
hello("New user");


Answer 3:

Found a pretty sweet solution here: http://blog.ostermiller.org/find-comment

Excerpt:

Now we just need to modify the comment end to allow any number of *:

/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/

We now have a regular expression that we can paste into text editors that support regular expressions. Finding our comments is a matter of pressing the find button. You might be able to simplify this expression somewhat for your particular editor. For example, in some regular expression implementations, [^] assumes the [\r\n] and all the [\r\n] can be removed from the expression.

This is easy to augment so that it will also find // style comments:

(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|(//.*)

Be sure to read the caveats, however, as this will remove comments from within comments, or can uncomment commented code improperly. Worked perfectly for me, however :-)
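As a rough illustration, the combined pattern above can be tried directly in JavaScript. This is only a sketch: the variable names and the sample input are made up for the example, and the forward slashes are escaped because the pattern is written as a regex literal. The last lines show one of the caveats: the pattern knows nothing about strings, so a URL inside a string literal is cut off at the //.

```javascript
// The combined pattern from the excerpt, as a JavaScript regex literal.
const commentPattern = /(\/\*([^*]|[\r\n]|(\*+([^*\/]|[\r\n])))*\*+\/)|(\/\/.*)/g;

// Made-up sample input with one trailing comment and one block comment.
const source = [
  "var a = 1; // trailing",
  "/* block",
  " * comment **/",
  "var b = 2;",
].join("\n");

// With the g flag, match() returns every comment the pattern finds.
const found = source.match(commentPattern);
console.log(found);

// Caveat: the pattern is not string-aware, so this string literal is damaged.
const broken = 'var url = "http://example.com";'.replace(commentPattern, "");
console.log(broken); // var url = "http:
```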



Answer 4:

The decomment library does exactly what you described:

Everything else should be left as it is, in particular white space is not removed, BUT if a comment takes a whole line, the line is removed too.

And it also supports JSON5, JavaScript ES6, CSS and HTML.



Answer 5:

Just a small insight that might help you make your complex regular expression much simpler.

Feel free to apply any of the tips from the answers above afterwards.

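var text = ".................."; // assuming this is the starting point

........

text = text
  .replace(/\r/g, "##R##")        // protect carriage returns
  .replace(/\n/g, "##N##")        // protect line feeds

  .replace(/\/\*(.*?)\*\//g, "")  // non-greedy, so each comment is matched on its own

  .replace(/##R##/g, "\r")        // restore the original line endings
  .replace(/##N##/g, "\n");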

Applying a small, independent replacement of \r and \n first will simplify your regex A LOT!

Without it, even with the g and m modifiers (global and multiline; neither of them affects greediness), you still won't succeed in removing multi-line comments unless you custom-build a "character-walker" loop. The reason is that . does not match line terminators in JavaScript, and the m flag only changes how ^ and $ behave.

What smart thing are we doing here that is worth mentioning?

We apply a nifty little trick known in discrete mathematics (languages and grammars) as "replacement outside of our grammar". I'm using it unconventionally to "protect" the \r and \n characters in the text without spending much computational power on processing them (cutting, reassembling and so on).

It is something of a gamble since ##R## and ##N##, although uncommon, might already occur in the text, but this is not a real issue since the placeholder can be made arbitrarily more complex.

In short: the regular expressions will be simpler, the replacements will work as intended without that whitespace bug, and \n and \r will be restored to their original positions, intact.
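For illustration, here is a minimal sketch of the placeholder trick wrapped in a function, next to a variant that skips the placeholders by using [\s\S] to match any character including line breaks. Both function names are hypothetical, and the comment pattern is assumed to be non-greedy (.*?) so that two separate comments are not merged into one match. Like the original, neither version is aware of strings.

```javascript
// Hypothetical helper: the placeholder trick described above.
function stripBlockComments(text) {
  return text
    .replace(/\r/g, "##R##")      // hide carriage returns from "."
    .replace(/\n/g, "##N##")      // hide line feeds from "."
    .replace(/\/\*.*?\*\//g, "")  // non-greedy: stop at the first "*/"
    .replace(/##R##/g, "\r")      // restore the original line endings
    .replace(/##N##/g, "\n");
}

// Hypothetical alternative: [\s\S] matches any character, newlines included,
// so no placeholders are needed.
function stripBlockCommentsDirect(text) {
  return text.replace(/\/\*[\s\S]*?\*\//g, "");
}

// Made-up sample with a multi-line comment and a single-line block comment.
const sample =
  "var b = 2;\n/**\nNever, ever do this\n*/\nvar d = 4; /* x */ var e = 5;";

console.log(stripBlockComments(sample));
console.log(stripBlockComments(sample) === stripBlockCommentsDirect(sample));
```

Note that the line which held the multi-line comment is left behind as an empty line; removing it as well needs the extra line handling asked for in the question.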



Answer 6:

A naive one-liner stripper:

var noComments = text.replace(/\/\*(.|[\r\n])*?\*\//g, '').replace(/\/\/.*/gm, '');

DISCLAIMER:

"naive" means:

  1. it strips everywhere, including inside strings. Say you have:

    var a = "/*";
    someImportantLogicHere();
    var b = "*/";
    

    then you will get

    var a = "";
    
  2. the order in which you apply these regexps matters; applying them in a different order gives different results

but for the other 95% of cases it's simple and practical
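If the string case from point 1 matters, one option is to walk the source character by character and keep track of whether you are inside a string. The following is only a sketch under stated assumptions, not a full JavaScript tokenizer: the function name stripComments is made up, regex literals (for example /\/*/) are not recognized, and template literals are treated as plain strings, so comments inside ${...} are left alone. It also tries to follow the formatting rules from the question: a line that holds only a comment is removed entirely, and a trailing comment is removed together with the blanks in front of it.

```javascript
// Hypothetical sketch, not a full JavaScript tokenizer.
// Known gaps: regex literals and comments inside template-literal ${...} parts.
function stripComments(src) {
  const blankRest = /[ \t]*(\r?\n|$)/y; // sticky: only blanks up to the line end
  let out = "";
  let i = 0;

  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];

    if (ch === '"' || ch === "'" || ch === "`") {
      // String literal: copy it verbatim, honoring backslash escapes.
      const start = i++;
      while (i < src.length && src[i] !== ch) {
        if (src[i] === "\\") i++; // skip the escaped character
        i++;
      }
      i++; // step past the closing quote
      out += src.slice(start, i);
      continue;
    }

    if (ch !== "/" || (next !== "/" && next !== "*")) {
      out += ch; // ordinary code character
      i++;
      continue;
    }

    // Comment: find where it ends.
    let end;
    if (next === "/") {
      end = src.indexOf("\n", i);
      if (end === -1) end = src.length;
      if (src[end - 1] === "\r") end--; // leave a CRLF line ending intact
    } else {
      end = src.indexOf("*/", i + 2);
      end = end === -1 ? src.length : end + 2;
    }

    blankRest.lastIndex = end;
    const rest = blankRest.exec(src);
    if (rest === null) {
      i = end; // code follows on the same line: drop only the comment
      continue;
    }

    // The comment is the last thing on its line: drop the blanks before it.
    let keep = out.length;
    while (keep > 0 && (out[keep - 1] === " " || out[keep - 1] === "\t")) keep--;
    out = out.slice(0, keep);

    // Whole-line comment: swallow the line break too. Trailing comment: keep it.
    const wholeLine = out === "" || out.endsWith("\n");
    i = end + rest[0].length - (wholeLine ? 0 : rest[1].length);
  }
  return out;
}

// The string example from point 1 passes through unchanged.
const tricky = 'var a = "/*";\nsomeImportantLogicHere();\nvar b = "*/";';
console.log(stripComments(tricky) === tricky); // true
```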



Answer 7:

You can use Babel with "comments": false to achieve this. I have written a demo for the-super-tiny-compiler, please check https://github.com/gengjiawen/the-super-tiny-compiler.
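A minimal configuration sketch, assuming Babel 7 and a babel.config.js file in the project root (the file name and location are assumptions, not taken from the linked demo). Both comments and retainLines are documented Babel options. Keep in mind that Babel regenerates the code from its syntax tree, so the original white space is not guaranteed to survive; retainLines only makes a best effort to keep code on its original lines.

```javascript
// babel.config.js (assumed file name and location)
module.exports = {
  comments: false,   // do not print comments in the generated output
  retainLines: true, // best effort: keep code on its original line numbers
};
```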