After learning how to "correctly" unset a node, I noticed that using PHP's unset() function leaves the tabs and spaces behind. So now I have this big chunk of white space in between nodes at times. I'm wondering if PHP iterates through blank spaces/returns/tabs and whether it would eventually slow down the system.
I'm also asking whether there's an easy way to remove the space unset() leaves behind.
Thanks, Ryan
ADDED NOTE:
This is how I removed the whitespace after unsetting a node, and it worked for me:
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->load($xmlPath);
$dom->save($xmlPath);
Whether it slows down the process: probably too little to care about.
And SimpleXML is just that: simple. If you require 'pretty' output, DOM is your friend.
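As a rough sketch of what that can look like: the snippet below re-parses the SimpleXML output with DOM so the leftover whitespace is dropped and the document is re-indented. The function name `prettyXml` and the sample document are made up for illustration. Note that `preserveWhiteSpace` has to be set before the document is parsed for it to have any effect.

```php
<?php
// Hypothetical helper: returns a re-indented copy of a SimpleXML document.
function prettyXml(SimpleXMLElement $xml): string
{
    $dom = new DOMDocument();
    // Must be set before parsing, so whitespace-only text nodes are discarded.
    $dom->preserveWhiteSpace = false;
    // Ask the serializer to indent the output.
    $dom->formatOutput = true;
    $dom->loadXML($xml->asXML());
    return $dom->saveXML();
}

// Sample data, assumed for the example.
$xml = simplexml_load_string("<foo>\n    <bar>baz</bar>\n    <qux>1</qux>\n</foo>");
unset($xml->bar);          // leaves the surrounding whitespace behind
$pretty = prettyXml($xml); // whitespace is rebuilt from scratch
echo $pretty;
```

This works on a copy, so the original SimpleXML object keeps its whitespace. Only the serialized result is cleaned up.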
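Whitespace in XML is stored as text nodes. An element whose children sit on their own indented lines really contains a text node (the line break and indentation) before and after each child element.
If you remove the <bar> node from such a document, the whitespace text nodes on either side of it stay behind, and that is the gap you are seeing.
I think SimpleXML won't let you access the text nodes easily (maybe via XPath), but DOM does. See Wrikken's answer for details. Now that you know that whitespace is a node, you can also imagine that parsing it into a node takes up some CPU cycles. However, I'd say the speed impact is negligible. When in doubt, do a benchmark with some real-world data.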
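To make the effect concrete, here is a minimal sketch using an assumed sample document (the element names are placeholders). It removes `<bar>` with unset() and shows that only whitespace remains inside `<foo>`.

```php
<?php
// Sample document, assumed for the example: <bar> is indented on its own line.
$xml = simplexml_load_string("<foo>\n    <bar>baz</bar>\n</foo>");

unset($xml->bar); // removes the element node only

$out = $xml->asXML();
// var_dump shows the raw string, including the leftover line breaks and spaces.
var_dump($out);
```

The element is gone, but the text nodes that used to surround it are still serialized between `<foo>` and `</foo>`.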
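EDIT: Proof that whitespace really is parsed into nodes: if you load an indented document with DOM and list the child nodes of the root element, you get DOMText nodes in between the element nodes.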
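A small check along those lines, with a made-up sample document, could look like this. It relies on DOMDocument's default of `preserveWhiteSpace = true`.

```php
<?php
$dom = new DOMDocument();
// Sample document, assumed for the example.
$dom->loadXML("<foo>\n    <bar>baz</bar>\n</foo>");

// Collect the class name of every direct child of the root element.
$types = [];
foreach ($dom->documentElement->childNodes as $child) {
    $types[] = get_class($child);
}

print_r($types); // DOMText, DOMElement, DOMText
```

The two DOMText entries are the line break and indentation before and after `<bar>`.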
Quick answers to the questions asked:
No, PHP (or libxml) doesn't really iterate over it. Having more whitespace theoretically slows down the system, but the effect is so small it can't be measured directly. You could test that yourself by removing all whitespace from your XML. It wouldn't get noticeably faster.
No easy way, I'm afraid. You can import your SimpleXML stuff into DOM and use formatOutput to completely remodel the whitespace, as suggested in another answer, or you can use a third-party library that will do it for you, but you won't find an easy, built-in way to do that.
It's actually libxml that does the XML parsing; whitespace is read by the parser the same as every other character in the input stream (or file). Most of the PHP XML APIs use libxml under the hood (XMLReader, XMLWriter, SimpleXML, XSLT, DOM...). Some of them give you access to whitespace (e.g. DOM, XMLReader), some don't (e.g. SimpleXML).
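For the "import into DOM" route mentioned above, one possible sketch is shown below. The helper name `stripBlankTextNodes` and the sample document are assumptions for illustration. Since `formatOutput` alone does not re-indent an element that still contains whitespace text nodes, the sketch first removes whitespace-only text nodes via XPath.

```php
<?php
// Hypothetical helper: strips whitespace-only text nodes, then re-indents.
function stripBlankTextNodes(SimpleXMLElement $xml): string
{
    // The DOM element shares its tree with the SimpleXML object,
    // so the removals below affect $xml as well.
    $doc = dom_import_simplexml($xml)->ownerDocument;
    $xpath = new DOMXPath($doc);

    // Copy the result to an array before removing nodes from the tree.
    $blank = iterator_to_array($xpath->query('//text()[normalize-space(.) = ""]'));
    foreach ($blank as $node) {
        $node->parentNode->removeChild($node);
    }

    $doc->formatOutput = true;
    return $doc->saveXML();
}

// Sample data, assumed for the example.
$xml = simplexml_load_string("<foo>\n    <bar>baz</bar>\n    <qux>1</qux>\n</foo>");
unset($xml->bar);
$clean = stripBlankTextNodes($xml);
echo $clean;
```

The XPath filter only matches text nodes that are empty after whitespace normalization, so text such as the `1` inside `<qux>` is left alone.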