I'm using the DOM extension in PHP to build some HTML documents, and I want the output to be nicely formatted (with new lines and indentation) so that it's readable. However, from the many tests I've done:
- "formatOutput = true" doesn't work at all with saveHTML(), only saveXML()
- Even if I use saveXML(), it still only works on elements created via the DOM, not on elements that are loaded with loadHTML(), even with "preserveWhiteSpace = false"
If anyone knows differently I'd really like to know how they got it to work.
So, I have a DOM document, and I'm using saveHTML() to output the HTML. As it's coming from the DOM I know it is valid, so there's no need to "Tidy" or validate it in any way.
I'm simply looking for a way to nicely format the output I receive from the DOM extension.
NB. As you may have guessed, I don't want to use the Tidy extension as a) it does a lot more than I need it to (the markup is already valid) and b) it actually makes changes to the HTML content (such as the HTML 5 doctype and some elements).
Follow Up:
OK, with the help of the answer below I've worked out why the DOM extension wasn't working. Although the given example works, it still wasn't working with my code. With the help of this comment I found that if the document contains any text nodes for which isWhitespaceInElementContent() returns true, no formatting is applied beyond that point. This happens regardless of whether or not preserveWhiteSpace is false. The solution is to remove all of these nodes (although I'm not sure whether this may have adverse effects on the actual content).
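For anyone wanting to try the node-removal approach described above, here is a minimal sketch. The helper name `removeWhitespaceNodes` and the sample markup are my own assumptions for illustration, not part of the DOM extension. Note that it strips every whitespace-only text node, so it could collapse significant spaces between inline elements or inside `<pre>`.

```php
<?php
// Hypothetical helper (name is an assumption): removes every text node
// that consists only of whitespace and returns how many were removed.
function removeWhitespaceNodes(DOMDocument $doc): int
{
    $xpath = new DOMXPath($doc);
    // Copy the result into an array first so that removing nodes
    // cannot interfere with the iteration.
    $textNodes = iterator_to_array($xpath->query('//text()'), false);
    $removed = 0;
    foreach ($textNodes as $node) {
        if ($node->isWhitespaceInElementContent()) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
    return $removed;
}

// Sample document (assumed markup) loaded with loadHTML().
$doc = new DOMDocument();
$doc->loadHTML("<html><body><div>\n<p>one</p>\n\n<p>two</p>\n</div></body></html>");

removeWhitespaceNodes($doc);

// Formatting is requested via saveXML(), since saveHTML() does not indent.
$doc->formatOutput = true;
$xml = $doc->saveXML($doc->documentElement);
echo $xml;
```

Serializing with saveXML() means the result is XML-style markup rather than HTML, so it is worth checking the output (for example, how empty elements are written) before relying on it.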
You can use the code for the hl_tidy function of the htmLawed library.
You're right, there seems to be no indentation for HTML (others are also confused). XML works, even with loaded code.
result:
the same with saveXML() ...
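To illustrate the XML case, here is a minimal sketch with loaded (not DOM-built) markup. The sample document is an assumption made up for this example. The important detail is that preserveWhiteSpace is set before the document is loaded.

```php
<?php
$doc = new DOMDocument();
// Both properties are set before loading; preserveWhiteSpace only
// affects parsing, so setting it after loadXML() would have no effect.
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;

// Assumed sample input with irregular whitespace between the elements.
$doc->loadXML('<root><item>one</item>   <item>two</item></root>');

$out = $doc->saveXML();
echo $out;
```

With these settings each `<item>` should come out on its own indented line.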
You probably forgot to set preserveWhiteSpace = false before calling loadHTML()?