I have an XML document that looks like this:
<Data
xmlns="http://www.domain.com/schema/data"
xmlns:dmd="http://www.domain.com/schema/data-metadata"
>
<Something>...</Something>
</Data>
I am parsing the information using SimpleXML in PHP. I am dealing with arrays and I seem to be having a problem with the namespace.
My question is: How do I remove those namespaces? I read the data from an XML file.
Thank you!
If you're using XPath, this is a limitation of XPath and not of PHP. Look at this explanation on XPath and default namespaces for more info.
More specifically, it's the xmlns="..." attribute in the root node that is causing the problem. It puts every unprefixed element into that namespace, so you'll need to register the namespace and then use a QName to refer to elements.
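$feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
// Bind the prefix "a" to the document's default namespace URI.
$feed->registerXPathNamespace("a", "http://www.domain.com/schema/data");
// $feed is already the root element, so anchor the path at the document root.
$result = $feed->xpath("/a:Data/a:Something/...");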
Important: The URI used in the registerXPathNamespace
call must be identical to the one that is used in the actual XML file.
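To make the effect of the registration concrete, here is a minimal sketch run against an inline copy of the document from the question. The element content "value" and the variable names are assumptions for illustration; the prefix "a" is arbitrary, and only the URI has to match.

```php
<?php
// Sample document modeled on the question; "value" is placeholder content.
$xmlString = <<<XML
<Data xmlns="http://www.domain.com/schema/data"
      xmlns:dmd="http://www.domain.com/schema/data-metadata">
  <Something>value</Something>
</Data>
XML;

$data = simplexml_load_string($xmlString);

// An unprefixed name in XPath 1.0 means "no namespace", so this finds nothing.
$unprefixed = $data->xpath('/Data/Something');

// After registering a prefix for the default namespace URI, the query matches.
$data->registerXPathNamespace('a', 'http://www.domain.com/schema/data');
$prefixed = $data->xpath('/a:Data/a:Something');

var_dump(empty($unprefixed));      // no match without the prefix
echo count($prefixed), "\n";       // one <Something> element found
echo (string) $prefixed[0], "\n";  // its text content
```

The unprefixed query comes back empty even though the element is clearly there, which is usually the symptom people run into first.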
I found the answer above to be helpful, but it didn't quite work for me.
This ended up working better:
// Gets rid of all namespace definitions
$xml_string = preg_replace('/xmlns[^=]*="[^"]*"/i', '', $xml_string);
// Gets rid of all namespace references
$xml_string = preg_replace('/[a-zA-Z]+:([a-zA-Z]+[=>])/', '$1', $xml_string);
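As a rough illustration, here are the two replacements applied to a small single-line sample. The sample document and its dmd:Info element are assumptions, not taken from the question. Note that these patterns only handle double-quoted declarations and names that are immediately followed by = or >, so a prefixed element that carries attributes would keep its prefix.

```php
<?php
// Hypothetical sample with a default namespace and one prefixed element.
$xml_string = '<Data xmlns="http://www.domain.com/schema/data" '
            . 'xmlns:dmd="http://www.domain.com/schema/data-metadata">'
            . '<dmd:Info>x</dmd:Info><Something>y</Something></Data>';

// Gets rid of all namespace definitions
$xml_string = preg_replace('/xmlns[^=]*="[^"]*"/i', '', $xml_string);
// Gets rid of all namespace references
$xml_string = preg_replace('/[a-zA-Z]+:([a-zA-Z]+[=>])/', '$1', $xml_string);

$xml = simplexml_load_string($xml_string);

// Both elements are now reachable as plain, un-namespaced children.
echo (string) $xml->Info, "\n";
echo (string) $xml->Something, "\n";
```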
The following PHP code automatically detects the default namespace specified in the XML file and registers it under the alias "default". Now all XPath queries have to be updated to include the prefix default.
So if you want to read XML files whether or not they contain a default namespace definition, and you want to query all Something
elements, you could use the following code:
$xml = simplexml_load_file($name);
$namespaces = $xml->getDocNamespaces();
if (isset($namespaces[''])) {
    $defaultNamespaceUrl = $namespaces[''];
    $xml->registerXPathNamespace('default', $defaultNamespaceUrl);
    $nsprefix = 'default:';
} else {
    $nsprefix = '';
}
$somethings = $xml->xpath('//'.$nsprefix.'Something');
echo count($somethings).' times found';
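The same logic can be wrapped in a small helper so it is easy to try on documents with and without a default namespace. This is only a sketch: the function name findByName and the two inline sample documents are hypothetical.

```php
<?php
// Hypothetical helper: find elements by local name whether or not the
// document declares a default namespace on its root element.
function findByName(SimpleXMLElement $xml, string $name): array
{
    $namespaces = $xml->getDocNamespaces();
    $prefix = '';
    if (isset($namespaces[''])) {
        // The empty-string key holds the default namespace URI.
        $xml->registerXPathNamespace('default', $namespaces['']);
        $prefix = 'default:';
    }
    $result = $xml->xpath('//' . $prefix . $name);
    return $result ?: []; // normalize a failed query to an empty array
}

$withNs = simplexml_load_string(
    '<Data xmlns="http://www.domain.com/schema/data">'
    . '<Something>a</Something><Something>b</Something></Data>'
);
$withoutNs = simplexml_load_string('<Data><Something>c</Something></Data>');

echo count(findByName($withNs, 'Something')), "\n";
echo count(findByName($withoutNs, 'Something')), "\n";
```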
To remove the namespace completely, you can use regular expressions (regex). For example:
$feed = file_get_contents("http://www.sitepoint.com/recent.rdf");
// Remove only the xmlns="..." declarations themselves, not the tags that carry them.
$feed = preg_replace('/\s+xmlns\s*=\s*(["\'])[^"\']*\1/', '', $feed); // This removes ALL default namespaces.
$xml_feed = simplexml_load_string($feed);
This strips any XML namespaces before the XML is loaded. Be careful with the regex, though: if any field contains something like
<![CDATA[ <Transfer xmlns="http://redeux.example.com">cool.</Transfer> ]]>
then the xmlns inside the CDATA is stripped as well, which may lead to unexpected results.
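The CDATA caveat can be reproduced with a short sketch. The pattern below is a simplified one that removes only xmlns="..." declarations (an assumption for illustration, not necessarily the exact regex from this answer), and the Note wrapper element is hypothetical.

```php
<?php
// Sample input: a default namespace on the root, plus markup-like text
// inside a CDATA section that also happens to contain xmlns="...".
$feed = '<Data xmlns="http://www.domain.com/schema/data">'
      . '<Note><![CDATA[ <Transfer xmlns="http://redeux.example.com">cool.</Transfer> ]]></Note>'
      . '</Data>';

// Simplified pattern (assumption): drop every unprefixed xmlns declaration.
$stripped = preg_replace('/\s+xmlns\s*=\s*(["\'])[^"\']*\1/', '', $feed);

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$xml = simplexml_load_string($stripped, 'SimpleXMLElement', LIBXML_NOCDATA);

// The literal text inside the CDATA section has been altered too.
echo trim((string) $xml->Note), "\n";
```

A regex has no notion of where markup ends and character data begins, so the text payload loses its xmlns attribute along with the real declaration on the root.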