Does extra space slow down the processor?

Posted 2019-06-10 10:03

After learning how to "correctly" unset a node, I noticed that using PHP's unset() function leaves the tabs and spaces behind. So now I have this big chunk of white space in between nodes at times. I'm wondering if PHP iterates through blank spaces/returns/tabs and whether it would eventually slow down the system.

I'm also asking whether there's an easy way to remove the space unset() leaves behind?

Thanks, Ryan

ADDED NOTE:

This is how I removed the whitespace after unsetting a node, and it worked for me.

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->load($xmlPath);
$dom->save($xmlPath);

4 answers
Emotional °昔
#2 · 2019-06-10 10:15

Whether it slows down the process: probably too little to care about.

And SimpleXML is just that: simple. If you require 'pretty' output, DOM is your friend:

<?php
$xml = '
<xml>
        <node>foo </node>
        <other>bar</other>
</xml>';
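// SimpleXML: unset() removes <other> but leaves the surrounding whitespace behind
$x = new SimpleXMLElement($xml);
unset($x->other);
echo $x->asXML();

// DOM: drop whitespace-only nodes on load and re-indent on output
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadXML($xml);
// With the whitespace nodes dropped, lastChild is the <other> element
$dom->documentElement->removeChild($dom->documentElement->lastChild);
echo $dom->saveXML();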
#3 · 2019-06-10 10:21

Whitespace in XML is stored as text nodes, e.g.

<foo>
    <bar>baz</bar>
</foo>

is really

<foo><- whitespace node
    -><bar>baz</bar><- whitespace node
-></foo>

If you remove the <bar> node, you get

<foo><- whitespace node
    -><- whitespace node
-></foo>

I think SimpleXML won't allow you to access the text nodes easily (maybe via XPath), but DOM does. See Wrikken's answer for details. Now that you know that whitespace is a node, you can also imagine that parsing it into a node takes up some CPU cycles. However, I'd say the speed impact is negligible. When in doubt, do a benchmark with some real-world data.
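For such a benchmark, a rough sketch might look like the following. The helpers buildXml() and timeParse() are hypothetical names made up for this illustration, and the generated document is only a stand-in for real data, so treat the numbers it prints as a starting point rather than a verdict.

```php
<?php
// Hypothetical helper: builds a document with $count <item> elements,
// either indented (lots of whitespace) or packed onto a single line.
function buildXml(int $count, bool $indented): string
{
    $nl  = $indented ? "\n" : '';
    $pad = $indented ? '        ' : '';
    $xml = '<root>' . $nl;
    for ($i = 0; $i < $count; $i++) {
        $xml .= $pad . '<item>' . $i . '</item>' . $nl;
    }
    return $xml . '</root>';
}

// Hypothetical helper: average wall-clock seconds per parse over $runs parses.
function timeParse(string $xml, int $runs): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $dom = new DOMDocument();
        $dom->loadXML($xml);
    }
    return (microtime(true) - $start) / $runs;
}

$indented = buildXml(10000, true);
$compact  = buildXml(10000, false);

printf("indented: %d bytes, %.6f s per parse\n", strlen($indented), timeParse($indented, 20));
printf("compact:  %d bytes, %.6f s per parse\n", strlen($compact), timeParse($compact, 20));
```

Swap the generated strings for your own files to see whether the difference matters for your data.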


EDIT: Proof that whitespace really is stored as nodes

$xml = <<< XML
<foo>
    <bar>baz</bar>
</foo>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);
foreach($dom->documentElement->childNodes as $node) {
    var_dump($node);
}

gives

object(DOMText)#4 (0) {}
object(DOMElement)#6 (0) {}
object(DOMText)#4 (0) {}
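As a small follow-up sketch (not part of the original answer), counting the child nodes with and without preserveWhiteSpace shows the same thing in numbers. The variable names are just for illustration.

```php
<?php
$xml = "<foo>\n    <bar>baz</bar>\n</foo>";

// Default behaviour: whitespace-only text nodes are kept
$keep = new DOMDocument();
$keep->loadXML($xml);

// preserveWhiteSpace must be set before loading to take effect
$drop = new DOMDocument();
$drop->preserveWhiteSpace = false;
$drop->loadXML($xml);

$keepCount = $keep->documentElement->childNodes->length;
$dropCount = $drop->documentElement->childNodes->length;

echo $keepCount, "\n"; // whitespace, <bar>, whitespace
echo $dropCount, "\n"; // only <bar>
```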
叼着烟拽天下
#4 · 2019-06-10 10:22

Quick answers to the questions asked:

I'm wondering if PHP iterates through blank spaces/returns/tabs and whether it would eventually slow down the system.

No, PHP (or libxml) doesn't really iterate over it. Having more whitespace theoretically slows down parsing, but the effect is too small to measure directly. You could test that yourself by removing all whitespace from your XML. It wouldn't make it noticeably faster.

I'm also asking whether there's an easy way to remove the space unset() leaves behind?

No easy way I'm afraid. You can import your SimpleXML stuff to DOM and use formatOutput to completely remodel the whitespace, as suggested in another answer, or you can use a third party library that will do it for you, but you won't find an easy, built-in way to do that.
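A minimal sketch of the SimpleXML-to-DOM route, assuming it is acceptable to drop every whitespace-only text node in the document (which may not be what you want for mixed content). The sample XML and variable names are illustrative only.

```php
<?php
$xml = "<xml>\n        <node>foo </node>\n        <other>bar</other>\n</xml>";

$x = new SimpleXMLElement($xml);
unset($x->other); // leaves the surrounding whitespace behind

// dom_import_simplexml() returns the DOMElement behind the SimpleXML object
$root = dom_import_simplexml($x);
$dom  = $root->ownerDocument;

// Remove every whitespace-only text node so formatOutput can re-indent
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//text()[normalize-space(.) = ""]') as $blank) {
    $blank->parentNode->removeChild($blank);
}

$dom->formatOutput = true;
$result = $dom->saveXML();
echo $result;
```

Text that contains non-whitespace characters, such as the trailing space in "foo ", is left untouched.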

forever°为你锁心
#5 · 2019-06-10 10:27

It's actually libxml that does the XML parsing; whitespace is read by the parser the same as every other character in the input stream (or file). Most of the PHP XML APIs use libxml under the hood (XMLReader, XMLWriter, SimpleXML, XSLT, DOM...). Some of them give you access to whitespace (e.g. DOM, XMLReader), some don't (e.g. SimpleXML).
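To illustrate the XMLReader case, here is a short sketch that counts the whitespace nodes the reader reports for a small document. It checks both whitespace node types, since which one libxml reports can depend on whether a DTD is involved.

```php
<?php
$xml = "<foo>\n    <bar>baz</bar>\n</foo>";

$reader = new XMLReader();
$reader->XML($xml);

$whitespaceCount = 0;
while ($reader->read()) {
    // Whitespace between elements shows up as its own node type
    if ($reader->nodeType === XMLReader::WHITESPACE
        || $reader->nodeType === XMLReader::SIGNIFICANT_WHITESPACE) {
        $whitespaceCount++;
    }
}
$reader->close();

echo $whitespaceCount, "\n";
```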
