Minifying HTML

Posted 2020-06-02 08:14

I've googled around but can't find any HTML minification scripts.

It occurred to me that maybe there's nothing more to HTML minification than removing all unneeded whitespace.

Am I missing something, or has my Google-fu been lost?

Tags: html minify
11 answers
Fickle 薄情
Reply #2 · 2020-06-02 08:35

You have to be careful when removing anything from HTML, as it's a fragile language. Depending on how your pages are coded, some of that whitespace might be significant; and if you use CSS styles such as white-space: pre, you may need to keep the whitespace. There are also numerous browser bugs, so basically any character in an HTML file might be there to satisfy some requirement or appease some browser.
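
To illustrate the point about significant whitespace, here is a minimal Python sketch that collapses whitespace everywhere except inside <pre> and <textarea>. The function name collapse_outside_preserved is made up for this example, and the regex approach assumes these elements are not nested; it also cannot detect elements styled with white-space: pre in a stylesheet, so treat it as a starting point rather than a safe general solution.

```python
import re

# Elements whose whitespace is significant by default. Elements styled with
# "white-space: pre" in CSS are NOT detected by this sketch.
PRESERVED = re.compile(
    r"<(pre|textarea)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def collapse_outside_preserved(html: str) -> str:
    """Collapse whitespace runs to one space, except inside <pre>/<textarea>."""
    out = []
    pos = 0
    for match in PRESERVED.finditer(html):
        # Collapse the text before the preserved element.
        out.append(re.sub(r"\s+", " ", html[pos:match.start()]))
        # Copy the preserved element through untouched.
        out.append(match.group(0))
        pos = match.end()
    # Collapse whatever follows the last preserved element.
    out.append(re.sub(r"\s+", " ", html[pos:]))
    return "".join(out)


if __name__ == "__main__":
    sample = "<p>a   b</p>\n\n<pre>x\n  y</pre>\n<p>c</p>"
    print(collapse_outside_preserved(sample))
```

The line break and indentation inside <pre> survive, while the runs between the other tags shrink to a single space.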

In my opinion, your best bet is to design the pages well with CSS techniques (I was recently able to take an important page on the site I work for and reduce its size by 50% just by recoding it using CSS instead of tables and nested style="..." attributes). Then use gzip to reduce the size of your pages for browsers that understand it. This will save bandwidth while preserving the structure of the HTML.
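
As a rough way to see what gzip does to markup, the following Python sketch compares the raw and gzipped size of a page using the standard gzip module. The sample page and the helper name gzip_sizes are invented for illustration; real pages will compress differently, and in practice the compression is usually switched on in the web server rather than done by hand.

```python
import gzip


def gzip_sizes(html: str) -> tuple[int, int]:
    """Return (raw_bytes, gzipped_bytes) for an HTML string."""
    raw = html.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical sample page: repetitive, indented markup.
row = '    <tr>\n        <td class="cell">value</td>\n    </tr>\n'
page = "<table>\n" + row * 200 + "</table>\n"

raw_size, gz_size = gzip_sizes(page)
print(f"raw: {raw_size} bytes, gzipped: {gz_size} bytes")
```

Repeated tags and indentation are exactly the kind of redundancy gzip removes, and the browser still receives the original, readable HTML after decompression.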

兄弟一词,经得起流年.
Reply #3 · 2020-06-02 08:37

There's a fairly lengthy discussion of this topic on this WordPress blog, which includes a very long proposed solution using PHP and HTML Tidy.

Juvenile、少年°
Reply #4 · 2020-06-02 08:39

Beyond HTML Tidy and removing whitespace, as the other answers mentioned, there isn't much.

This is more of a manual task: pulling style attributes out into CSS (hopefully you're not using FONT tags, etc.) and using fewer tags and attributes where possible (for example, not embedding <strong> tags in an element but using CSS to make the whole element font-weight: bold, unless of course it makes semantic sense to use <strong>).

Reply #5 · 2020-06-02 08:39

Yes, I guess it's pretty much removing whitespace and comments. You cannot replace identifiers with shorter ones as you can in JavaScript, since chances are that CSS classes or JavaScript will depend on those identifiers.

Also, you should be careful when removing whitespace and make sure that there is always at least one whitespace character left, otherwise allyourtextwilllooklikethis.
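
A minimal Python sketch of that approach: drop comments, then replace each run of whitespace with a single space instead of deleting it. The function name is made up for this example, and it assumes the page has no IE conditional comments, no whitespace-sensitive elements such as <pre>, and no comment-like text inside scripts.

```python
import re


def strip_comments_and_collapse(html: str) -> str:
    """Remove HTML comments and collapse whitespace runs to one space."""
    # Assumption: every <!-- ... --> is an ordinary comment that is safe to drop.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Replace each whitespace run with ONE space, so words never get glued together.
    return re.sub(r"\s+", " ", html).strip()


if __name__ == "__main__":
    sample = "<p>all   your\n text</p> <!-- note -->\n<p>ok</p>"
    print(strip_comments_and_collapse(sample))
```

Because every run is replaced by a space rather than removed, "all your text" stays three separate words.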

贼婆χ
Reply #6 · 2020-06-02 08:41

Here is a minifier for HTML5 written in PHP.

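<?php
// Read the source page; stop early if it cannot be read.
$in = file_get_contents('path/to/source.html');
if ($in === false) {
    exit("Could not read path/to/source.html\n");
}

// Note: these regexes also touch <pre>, <textarea>, <script> and <style>
// content, so only use them on pages where that is harmless.

// Collapse runs of two or more whitespace characters into one space.
$in = preg_replace('/\s{2,}/', ' ', $in);
// Trim leading and trailing whitespace.
$in = preg_replace('/^\s+|\s+$/m', '', $in);
/* Strip whitespace between tags.
Without this, the HTML parser creates a one-space text node between the tags.
If you need the gap back, use &nbsp; or &shy; or, better, padding or margin. */
$in = preg_replace('/ ?>\s+< ?/', '><', $in);
// Remove the trailing slash of self-closing tags.
$in = preg_replace('@ ?/>@', '>', $in);
// Remove HTML comments, except IE conditional comments (anything containing "[").
$in = preg_replace('/<!--[^\[]*?-->/', '', $in);
// Remove attribute quotes where the value allows it.
$in = preg_replace('/="([^\s\'"=><`]+)"/', '=$1', $in);
$in = preg_replace("/='([^\s'\"=><`]+)'/", '=$1', $in);

file_put_contents('path/to/min.html', $in);
?>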

After that you have shorter HTML, largely on a single line.

It is better to put the regular expressions into an array, but take care to escape the backslashes.
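
The array idea can be sketched as a table of (pattern, replacement) rules applied in order. This version is written in Python rather than PHP, purely as an illustration: the names RULES and minify are invented here, only four of the steps above are included, and Python raw strings sidestep the backslash escaping issue. In PHP, preg_replace itself accepts an array of patterns and an array of replacements.

```python
import re

# Ordered rule table: each entry is (compiled pattern, replacement).
RULES = [
    (re.compile(r"\s{2,}"), " "),                    # collapse whitespace runs
    (re.compile(r"^\s+|\s+$", re.MULTILINE), ""),    # trim leading/trailing whitespace
    (re.compile(r">\s+<"), "><"),                    # drop whitespace between tags
    (re.compile(r"<!--[^\[]*?-->"), ""),             # drop comments without "["
]


def minify(html: str) -> str:
    """Apply every rule in RULES, in order, to the HTML string."""
    for pattern, replacement in RULES:
        html = pattern.sub(replacement, html)
    return html


if __name__ == "__main__":
    sample = "<div>\n  <p>hi   there</p>\n  <!-- c -->\n</div>\n"
    print(minify(sample))
```

Keeping the rules in one list makes the order explicit and lets you add or remove a step without touching the loop.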
