How can I remove all whitespace characters before and after an XML field?
<data version="2.0">
<field>
1
</field>
<field something=" some attribute here... ">
2
</field>
</data>
Notice the whitespace around 1, 2 and 'some attribute here...'; I want to remove it with PHP.
if(($xml = simplexml_load_file($file)) === false) die();
print_r($xml);
Also, the data doesn't appear to be a string; I need to put (string) before each variable. Why?
Since simplexml_load_file() returns an object that can be traversed like an array, you could do something like this:
To do that in PHP you first have to convert the document into a DOMDocument so that you can properly address, via DOMXPath, the nodes whose whitespace you want to normalize. The XPath support in SimpleXMLElement is too limited to access text nodes as precisely as this operation needs.
An XPath query to access all text nodes that are within leaf elements, plus all attributes, is:
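The query itself did not survive in this copy of the answer. As a minimal sketch, assuming an expression of the form `//*[not(*)]/text() | //@*` (a reconstruction, not necessarily the original), the following shows which nodes such a query would select. The inline sample document and the `$selected` variable are illustrative only.

```php
<?php
// Assumed reconstruction of the query: text nodes of leaf elements
// (elements without child elements), plus all attributes.
$query = '//*[not(*)]/text() | //@*';

$doc = new DOMDocument();
$doc->loadXML(
    '<data version="2.0"><field> 1 </field>'
    . '<field something=" some attribute here... "> 2 </field></data>'
);

$xpath = new DOMXPath($doc);

// Collect the names of the selected nodes to see what the query matches.
$selected = [];
foreach ($xpath->query($query) as $node) {
    $selected[] = $node->nodeName;
}

print_r($selected);
```

Text nodes are reported under the name `#text`, and attributes under their attribute name, so the list makes it easy to check that only leaf text and attributes were picked up.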
Given that $xml is a SimpleXMLElement, you could do white-space normalization like in the following example:
You could perhaps stretch this to all text nodes (as suggested in related Q&A), but this might require document normalization under some circumstances. As text() in XPath does not distinguish between text nodes and CDATA sections, you might want to skip those nodes (DOMCdataSection) or expand them into text nodes when loading the document (use the LIBXML_NOCDATA option for that) to achieve more useful results.
Because it's an object of type SimpleXMLElement, if you want the string value of such an object (element), you need to cast it to string. See also the following reference question:
And last but not least: don't trust print_r or var_dump when you use them on a SimpleXMLElement: they do not show the truth. E.g. you could override __toString() in a subclass, which could also solve your issue:
Even though casting to string would normally apply (e.g. with echo), the output of print_r still would not reflect these changes. So better not rely on it; it can never show the whole picture.
Full example code to this answer (Online Demo):
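The original example code is missing from this copy, so what follows is only a sketch of the approach described above, not the answerer's actual code. It assumes the XPath expression `//*[not(*)]/text() | //@*` and collapses runs of whitespace before trimming. The document is loaded from an inline string (`$string`, a stand-in for the file in the question) so that the example is self-contained.

```php
<?php
// Sample document standing in for the file from the question.
$string = <<<XML
<data version="2.0">
<field>
1
</field>
<field something=" some attribute here... ">
2
</field>
</data>
XML;

$xml = simplexml_load_string($string);
if ($xml === false) {
    die('Could not parse XML');
}

// SimpleXML and DOM work on the same underlying document, so changes
// made through DOM are visible through $xml afterwards.
$doc = dom_import_simplexml($xml)->ownerDocument;
$xpath = new DOMXPath($doc);

// Assumed query: leaf-element text nodes and all attributes.
foreach ($xpath->query('//*[not(*)]/text() | //@*') as $node) {
    // Collapse whitespace runs to a single space, then trim both ends.
    $node->nodeValue = trim(preg_replace('~\s+~u', ' ', $node->nodeValue));
}

// Elements and attributes are SimpleXMLElement objects; cast to get strings.
$first = (string) $xml->field[0];
$attribute = (string) $xml->field[1]['something'];

echo $first, "\n", $attribute, "\n", $xml->asXML();
```

The explicit `(string)` casts are what turn the SimpleXMLElement objects into plain strings. Without them `$first` and `$attribute` would still be objects.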
You may want to use something like this:
I haven't tried this, but you can find more on this at http://www.lonhosford.com/lonblog/2011/01/07/php-simplexml-load-xml-file-preserve-cdata-remove-whitespace-between-nodes-and-return-json/.
Note that the spaces between the opening and closing tags (<x> _space_ </x>) and inside attribute values (<x attr=" _space_ ">) are actually part of the XML document's data (in contrast with the spaces between tags, as in <x> _space_ <y>), so I would suggest that the source you use should be a bit less messy with spaces.
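To illustrate that point, here is a small sketch showing that the parser hands the surrounding spaces back as data, and that trimming on read is one possible fallback when the source cannot be cleaned up. The element name `x` and attribute name `attr` are taken from the placeholders above, and the values `a` and `b` are made up for the example.

```php
<?php
$xml = simplexml_load_string('<x attr=" b "> a </x>');

// The surrounding spaces are part of the values as parsed.
$rawText = (string) $xml;
$rawAttr = (string) $xml['attr'];

// Trimming when reading is a simple workaround if the source stays messy.
$text = trim($rawText);
$attr = trim($rawAttr);

var_dump($rawText, $rawAttr, $text, $attr);
```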