My idea is to somehow minify HTML code on the server side, so the client receives fewer bytes.
What do I mean by "minify"?
Not zipping. More like what the jQuery creators do with their .min.js versions. In other words, I need to remove unnecessary whitespace and newlines, but I can't remove so much that the presentation of the HTML changes (for example, removing the whitespace between actual words in a paragraph).
Are there any tools that can do it? I know there is HtmlPurifier. Is it able to do it? Any other options?
P.S. Please don't offer regexes. I know that only Chuck Norris can parse HTML with them. =]
You can use the Pretty Diff tool: http://prettydiff.com/?m=minify&html It will also minify any CSS and JavaScript in the HTML code, and the minification occurs in a regressive manner so as not to prevent future beautification of the HTML back to readable form.
A bit late but still... By using output_buffering it is as simple as that:
Yes, here's a tool you could include into a build process or work into a web cache layer: http://code.google.com/p/htmlcompressor/
Or, if you're looking for a tool to minify HTML that you paste in, try: http://www.willpeavy.com/minifier/
You could parse the HTML code into a DOM tree (which should keep content whitespace in the nodes), then serialise it back into HTML, without any prettifying spaces.
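To make the parse-then-serialise idea concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. It is a streaming variation rather than a full DOM round trip: it re-emits each tag as it is parsed and collapses every run of whitespace in text to a single space, which is usually safe for rendering. The names `Minifier`, `minify_html` and `PRESERVE` are made up for this example, and dropping comments is my own assumption (it would also remove IE conditional comments). Treat it as an illustration, not a production minifier.

```python
import re
from html.parser import HTMLParser

# Elements whose text must be copied through untouched (assumed list).
PRESERVE = {"pre", "textarea", "script", "style"}
WS_RUN = re.compile(r"\s+")


class Minifier(HTMLParser):
    """Re-serialises parsed HTML, collapsing whitespace runs in text."""

    def __init__(self):
        # Keep entities such as &amp; exactly as written in the source.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.preserve_depth = 0

    def handle_starttag(self, tag, attrs):
        # Emit the start tag exactly as it appeared in the input.
        self.out.append(self.get_starttag_text())
        if tag in PRESERVE:
            self.preserve_depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit as written.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")
        if tag in PRESERVE and self.preserve_depth:
            self.preserve_depth -= 1

    def handle_data(self, data):
        if self.preserve_depth == 0:
            # Collapse to one space, never to nothing, so words stay apart.
            data = WS_RUN.sub(" ", data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_comment(self, data):
        pass  # Assumption: comments are not needed by the client.


def minify_html(html):
    parser = Minifier()
    parser.feed(html)
    parser.close()  # Flush any text still buffered by the parser.
    return "".join(parser.out)


if __name__ == "__main__":
    page = "<div>\n  <p>Hello   <b>big</b>\n   world</p>\n</div>"
    print(minify_html(page))
```

Whitespace is collapsed to a single space instead of being removed because a space between inline elements can be visible in the rendered page. Stripping it entirely is only safe between block-level elements, which this sketch does not try to detect.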