A simple regex search and replacement in php for m

Posted 2019-02-09 13:47

Can you post a regex search and replacement in PHP for minifying/compressing JavaScript?

For example, here's a simple one for CSS:

  header('Content-type: text/css');
  ob_start("compress");
  function compress($buffer) {
    /* remove comments */
    $buffer = preg_replace('!/\*[^*]*\*+([^/][^*]*\*+)*/!', '', $buffer);
    /* remove tabs, spaces, newlines, etc. */
    $buffer = str_replace(array("\r\n", "\r", "\n", "\t", '  ', '    ', '    '), '', $buffer);
    return $buffer;
  }

  /* put CSS here */

  ob_end_flush();

And here's one for HTML:

<?php
/* Minify All Output - based on the search and replace regexes. */
function sanitize_output($buffer)
{
    $search = array(
        '/\>[^\S ]+/s', //strip whitespaces after tags, except space
        '/[^\S ]+\</s', //strip whitespaces before tags, except space
        '/(\s)+/s'  // shorten multiple whitespace sequences
        );
    $replace = array(
        '>',
        '<',
        '\\1'
        );
    $buffer = preg_replace($search, $replace, $buffer);
    return $buffer;
}
ob_start("sanitize_output");
?>
<html>...</html>

But what about one for JavaScript?

2 answers
霸刀☆藐视天下
Reply #2 · 2019-02-09 14:31

I'm writing my own minifier because my JavaScript has some PHP inside it. One problem is still unsolved: preg_replace cannot treat quotes as boundaries, or rather, it cannot tell whether the number of quotes seen so far is even or odd, so it does not know whether it is inside a string. On top of that, there are double quotes, escaped double quotes, single quotes, and escaped single quotes. Here are just some interesting preg functions.

<?php
// Draft: $str holds the source to minify. None of these patterns respect string literals.
$str = preg_replace('@//.*@', '', $str); // delete line comments (also cuts "//" inside strings, e.g. URLs)
$str = preg_replace('@\s*/>@', '>', $str); // delete the XHTML tag slash ( />)
$str = str_replace(array("\n", "\r", "\t"), "", $str); // delete newlines and tabs
$str = preg_replace("/<\?(.*\[\'(\w+)\'\].*)\?>/", "?>$1<?", $str); // rewrite associative array to object
$str = preg_replace("/\s*([\{\[\]\}\(\)\|&;]+)\s*/", "$1", $str); // delete white space around brackets and operators
$count = preg_match_all("/(\Wvar (\w{3,})[ =])/", $str, $matches); // find var names; they land in $matches[2]
$x = 65; $y = 64; // ASCII codes used to build two-letter names, starting at "AA"
for ($i = 0; $i < $count; $i++) {
    if ($y + 1 > 90) { $y = 65; $x++; } // past "Z": restart the second letter, advance the first
    else $y++;
    $name = $matches[2][$i]; // preg_match_all groups results by capture group, then by match
    $short = chr($x) . chr($y);
    $str = preg_replace("/(\W)(" . $name . "=" . $name . "\+)(\W)/", "$1" . $short . "+=$3", $str); // replace 'longvar=longvar+' with 'AA+='
    $str = preg_replace("/(\W)(" . $name . ")(\W)/", "$1" . $short . "$3", $str); // replace all other occurrences
}
// echo or save $str.
?>
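
One common way around the quoting problem described above is to let a single pattern match string literals and comments together, then keep the strings and drop only the comments. The sketch below is written in JavaScript for illustration (the same alternation should carry over to preg_replace_callback); `TOKEN` and `stripComments` are hypothetical names, and it deliberately ignores regex literals and template literals, which a real minifier would also have to handle.

```javascript
// Matches a double-quoted string, a single-quoted string, a block comment,
// or a line comment. Strings may contain backslash-escaped quotes.
const TOKEN = /"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|\/\*[\s\S]*?\*\/|\/\/.*/g;

// Hypothetical helper: removes comments but leaves string literals untouched.
function stripComments(js) {
  return js.replace(TOKEN, (m) => (m[0] === '"' || m[0] === "'" ? m : ""));
}

const input =
  String.raw`var url = "http://example.com"; // home page` + "\n" +
  String.raw`var s = 'it\'s // fine';`;

// The naive pattern from the draft cuts into both string literals.
const naive = input.replace(/\/\/.*/g, "");

// The string-aware version only removes "// home page".
const stripped = stripComments(input);

console.log(naive);
console.log(stripped);
```

Because the string alternatives come first and the regex engine consumes a whole string literal in one match, a `//` inside quotes is never seen as the start of a comment.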

You may do similarly with function names (with this pattern the names end up in $matches[1]):

$count= preg_match_all("/function (\w{3,})/", $str, $matches);

If you want to see the replaced vars, put the following code in the for-loop:

echo chr($x).chr($y)."=".$matches[2][$i]."<br>";

Separate PHP from JS by:

 $jsphp = (array) preg_split("/<\?php|\?>/", $str);
 for ($i = 0; $i < count($jsphp); $i++) {
    if ($i % 2 == 0) { /* do something with the JS part */ }
    else { /* do something with the PHP part */ }
 }
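
The even/odd split above throws away the `<?php` and `?>` delimiters, so the pieces cannot be glued back together as they were. A minimal sketch of a variant that keeps them, written in JavaScript for illustration (in PHP, preg_split with the PREG_SPLIT_DELIM_CAPTURE flag plays the same role). `minifyOutsidePhp` and `collapseSpaces` are hypothetical names, and the pattern assumes that `?>` never appears inside a PHP string.

```javascript
// Hypothetical helper: applies `minifyJs` only to the text outside <?php ... ?>.
function minifyOutsidePhp(mixed, minifyJs) {
  // A capture group in split() keeps the matched PHP blocks in the result,
  // so even indexes hold JavaScript and odd indexes hold PHP.
  const parts = mixed.split(/(<\?php[\s\S]*?\?>)/);
  return parts.map((part, i) => (i % 2 === 0 ? minifyJs(part) : part)).join("");
}

// Stand-in for a real minifier: collapses runs of white space to one space.
const collapseSpaces = (js) => js.replace(/\s+/g, " ");

const mixed = "var total  =  1;\n<?php echo $count; ?>\nvar other  =  2;";
const result = minifyOutsidePhp(mixed, collapseSpaces);

console.log(result); // var total = 1; <?php echo $count; ?> var other = 2;
```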

This is only a draft. I'm always happy to get suggestions. I hope my English was understandable...

SAY GOODBYE
Reply #3 · 2019-02-09 14:44

A simple regex for minifying/compressing JavaScript is unlikely to exist anywhere. There are probably several good reasons for this; here are a couple:

Line breaks and semicolons. Good JavaScript minifiers remove all unnecessary line breaks. But because JavaScript engines accept code that omits the semicolon at the end of a statement, a minifier can easily break such code unless it is sophisticated enough to recognize and handle different coding styles.
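
A small sketch of the failure mode just described: the snippet relies on line breaks instead of semicolons, and simply deleting the line breaks (as a naive str_replace would) turns working code into a syntax error. The `run` helper is a hypothetical wrapper around `new Function`, used here only to execute the text and report what happened.

```javascript
// Valid JavaScript that relies on automatic semicolon insertion at line ends.
const source = "var a = 1\nvar b = 2\nreturn a + b";

// Naive "minification": delete every line break.
const joined = source.replace(/[\r\n]+/g, "");

// Hypothetical helper: runs a function body and reports success or the error name.
function run(body) {
  try {
    return { ok: true, value: new Function(body)() };
  } catch (e) {
    return { ok: false, error: e.name };
  }
}

const before = run(source); // { ok: true, value: 3 }
const after = run(joined);  // { ok: false, error: "SyntaxError" }

console.log(before, after);
```

Replacing the line breaks with spaces does not help either, because the engine only inserts a missing semicolon where a line break is present.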

Dynamic language constructs. Many of the good JavaScript minifiers available will also shorten the names of your variables and functions. For instance, a function named 'strip_white_space' that is called 12 times in your file might be renamed to simply 'a', saving 16 characters per call, or 192 characters in total. Unless your file has a lot of comments and/or whitespace, optimizations like these are where the majority of your file size savings will come from.

Unfortunately, this is much more complicated than a simple regex should try to handle. Say you do something as simple as:

var length = 12, height = 15;
    // other code that uses these length and height values

var arr = [1, 2, 3, 4];
for (i = (arr.length - 1); i >= 0; --i) {
    //loop code
}

This is all valid code. BUT, how does the minifier know what to replace? The first "length" has "var" before it (but it doesn't have to), while "height" just has a comma before it. And if the minifier is smart enough to replace the first "length" properly, how smart does it have to be to know NOT to change the word "length" when it is used as a property of the array? It would get even more complicated if you defined a JavaScript object with its own "length" property and referred to it with the same dot notation.
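
To make the problem concrete, here is a sketch that applies a word-boundary rename to a slightly adapted copy of the snippet (a `visited` counter, a `var` on the loop variable, and a `return` are added so the result can be checked). The replacement name `a` and the variable names in the sketch are illustrative assumptions.

```javascript
// The snippet from above, adapted so it can be executed and checked.
const source = [
  "var length = 12, height = 15;",
  "var arr = [1, 2, 3, 4];",
  "var visited = 0;",
  "for (var i = arr.length - 1; i >= 0; --i) { visited++; }",
  "return [length, visited];",
].join("\n");

// Naive rename: every whole word "length" becomes "a", including arr.length.
const naive = source.replace(/\blength\b/g, "a");

// Slightly smarter: skip occurrences that directly follow a dot.
const guarded = source.replace(/(?<!\.)\blength\b/g, "a");

const original = new Function(source)();       // [12, 4]
const naiveResult = new Function(naive)();     // [12, 0]: arr.a is undefined, the loop never runs
const guardedResult = new Function(guarded)(); // [12, 4]

console.log(original, naiveResult, guardedResult);
```

The naive version still runs without any error, it just silently skips the loop. The dot check rescues this particular snippet, but it would still rename an object literal key such as `{ length: 3 }`, the text inside a string like `"length"`, and a same-named variable in an unrelated scope, which is why real minifiers parse the code instead of pattern-matching it.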

Non-regex options. Several projects exist to solve this problem using more complex solutions than just a simple regex, but many of them don't make any attempt to change variable names, so I still stick with Dean Edwards' Packer, Douglas Crockford's JSMin, or something like the YUI Compressor.

PHP implementation of Douglas Crockford's JSMin

https://github.com/mrclay/minify
