This is starting to piss me off real bad. I have this XML code:
Updated with correct namespaces
<?xml version="1.0" encoding="utf-8"?>
<Infringement xsi:schemaLocation="http://www.movielabs.com/ACNS http://www.movielabs.com/ACNS/ACNS2v1.xsd" xmlns="http://www.movielabs.com/ACNS" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Case>
<ID>...</ID>
<Status>Open</Status>
</Case>
<Complainant>
<Entity>...</Entity>
<Contact>...</Contact>
<Address>...</Address>
<Phone>...</Phone>
<Email>...</Email>
</Complainant>
<Service_Provider>
<Entity>...</Entity>
<Address></Address>
<Email>...</Email>
</Service_Provider>
<Source>
<TimeStamp>...</TimeStamp>
<IP_Address>...</IP_Address>
<Port>...</Port>
<DNS_Name></DNS_Name>
<Type>...</Type>
<UserName></UserName>
<Number_Files>1</Number_Files>
<Deja_Vu>No</Deja_Vu>
</Source>
<Content>
<Item>
<TimeStamp>...</TimeStamp>
<Title>...</Title>
<FileName>...</FileName>
<FileSize>...</FileSize>
<URL></URL>
</Item>
</Content>
</Infringement>
And this PHP code:
<?php
$data = urldecode($_POST["xml"]);
$newXML = simplexml_load_string($data);
var_dump($newXML->xpath("//ID"));
?>
I've dumped only $newXML and gotten tons of data, but the only XPath I've run that returned anything but an empty array was "*".
Isn't "//ID" supposed to find all ID nodes in the document? Why isn't it working?
Thanks
Your XML document's root element seems to have a default namespace with the URI "http://www.movielabs.com/ACNS". This means that all elements in your document belong to that namespace. The problem is that XPath expressions without a namespace prefix search for elements that don't belong to any namespace. To search for elements (or attributes...) from a certain namespace, you need to register the namespace URI under some prefix and then use this prefix in your XPath expression.
In PHP's SimpleXML, this is done by calling registerXPathNamespace() on the element before running the query, with a prefix of your choice.
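A minimal sketch of that registration, assuming a cut-down copy of the document from the question. The ID value `example-id` and the prefix name `prefix` are placeholders picked for illustration, not something taken from the real data.

```php
<?php
// Cut-down stand-in for the posted document; "example-id" is a placeholder value.
$data = <<<'XML'
<Infringement xmlns="http://www.movielabs.com/ACNS">
  <Case>
    <ID>example-id</ID>
    <Status>Open</Status>
  </Case>
</Infringement>
XML;

$newXML = simplexml_load_string($data);

// Bind an arbitrary prefix to the namespace URI used in the document.
$newXML->registerXPathNamespace('prefix', 'http://www.movielabs.com/ACNS');

// The prefixed query now finds the ID element in that namespace.
$ids = $newXML->xpath('//prefix:ID');
var_dump(count($ids));       // int(1)
var_dump((string) $ids[0]);  // string(10) "example-id"
```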
The prefix can be practically any text, but the namespace URI must match exactly the one used in your XML document.
So what was returned from var_dump($newXML->xpath("*"))? Was it <Infringement>?
If the problem is namespaces, try an XPath expression that tests local-name() instead of the element name.
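Something along these lines should do it. This is only a sketch against a shortened stand-in for the posted XML, with `example-id` as a made-up value:

```php
<?php
// Shortened stand-in for the posted document; "example-id" is a placeholder value.
$data = '<Infringement xmlns="http://www.movielabs.com/ACNS">'
      . '<Case><ID>example-id</ID></Case>'
      . '</Infringement>';

$newXML = simplexml_load_string($data);

// Unprefixed name test: only matches ID elements in no namespace.
$plain = $newXML->xpath('//ID');

// local-name() compares the name without looking at the namespace.
$anyNamespace = $newXML->xpath("//*[local-name() = 'ID']");

var_dump(count($plain));         // int(0)
var_dump(count($anyNamespace));  // int(1)
```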
This will match any element in the document whose name is 'ID', regardless of namespace.
Wait, what? Are you sure you showed us all the xmlns-related attributes in the document?
Update: The question was edited to show that the XML really does have a default namespace declaration. That explains the original problem: your XPath expression selects ID elements that are in no namespace, but the elements in your document are in the movielabs ACNS namespace, thanks to the default namespace declaration.
The declaration
xmlns="http://www.movielabs.com/ACNS"
on an element means "this element and all descendants that don't have a namespace prefix (like ID) are in the namespace represented by the namespace URI 'http://www.movielabs.com/ACNS'." (Unless an intervening descendant has a different default namespace declaration, which would shadow this one.)
So use my local-name() answer above to ignore namespaces, or use jasso's technique to register the movielabs ACNS namespace and use it as intended.
To match an element in any namespace, select it by its local name rather than by its qualified name.
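To see how the two approaches differ, here is a rough sketch with a document where a nested element shadows the default namespace. The `Extra` element and the `urn:example:other` namespace are invented for the example and are not part of ACNS.

```php
<?php
// Hypothetical document: <Extra> redeclares the default namespace for its subtree.
$xml = <<<'XML'
<Infringement xmlns="http://www.movielabs.com/ACNS">
  <Case><ID>acns-id</ID></Case>
  <Extra xmlns="urn:example:other"><ID>other-id</ID></Extra>
</Infringement>
XML;

$element = simplexml_load_string($xml);
$element->registerXPathNamespace('acns', 'http://www.movielabs.com/ACNS');

// Registered prefix: only ID elements in the ACNS namespace.
$onlyAcns = $element->xpath('//acns:ID');

// local-name(): ID elements in any namespace, including the shadowed one.
$anyNamespace = $element->xpath("//*[local-name() = 'ID']");

var_dump(count($onlyAcns));      // int(1)
var_dump(count($anyNamespace));  // int(2)
```

The prefixed query is the stricter of the two, which is usually what you want once a document mixes vocabularies.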
You have an XML namespace defined on the document element (the xmlns="http://www.movielabs.com/ACNS" attribute). The namespace is the URL http://www.movielabs.com/ACNS. This has to be a globally unique string (a URI). Because of that, URLs are often used: the chance that someone else uses your domain for a namespace is very low, and you can put some documentation at the URL.
The XML parser resolves the namespaces. Each node gets 4 properties; in PHP's DOM they are $nodeName, $prefix, $localName and $namespaceURI.
Compare <Infringement xmlns="http://www.movielabs.com/ACNS"/> with <movie:Infringement xmlns:movie="http://www.movielabs.com/ACNS"/>: $namespaceURI and $localName are stable. The other two depend on the prefix. The prefix is an alias for the namespace. The namespace URI is long and complex, so it would make the XML a lot more difficult to read and write if it were used on each element/attribute. But you can interpret each element node as the combination of its namespace URI and its local name.
So the namespace is the one thing that defines what the nodes mean, not the prefix/alias. Prefixes can be redefined on a sub element.
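A small sketch that prints those properties for the two spellings of the root element. The helper name `describeRoot` is made up for this example.

```php
<?php
// Hypothetical helper: collect the namespace-related properties of the document element.
function describeRoot(string $xml): array
{
    $document = new DOMDocument();
    $document->loadXML($xml);
    $root = $document->documentElement;

    return [
        'nodeName'     => $root->nodeName,
        'prefix'       => (string) $root->prefix,
        'localName'    => $root->localName,
        'namespaceURI' => $root->namespaceURI,
    ];
}

$default  = describeRoot('<Infringement xmlns="http://www.movielabs.com/ACNS"/>');
$prefixed = describeRoot('<movie:Infringement xmlns:movie="http://www.movielabs.com/ACNS"/>');

// Local name and namespace URI are identical for both documents.
var_dump($default['localName'] === $prefixed['localName']);        // bool(true)
var_dump($default['namespaceURI'] === $prefixed['namespaceURI']);  // bool(true)

// Node name and prefix change with the alias used in the XML.
var_dump($default['nodeName'], $prefixed['nodeName']);
var_dump($default['prefix'], $prefixed['prefix']);
```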
XPath uses the same concept with its own resolver. You register your own prefixes for a namespace, so it doesn't matter how the prefixes are used in the XML; only the namespace URI has to match.
In DOM you do this on the DOMXPath instance, using registerNamespace().
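For DOM, a sketch could look like this. The prefix `acns` and the value `example-id` are arbitrary choices for the example.

```php
<?php
// Shortened stand-in for the posted document; "example-id" is a placeholder value.
$xml = <<<'XML'
<Infringement xmlns="http://www.movielabs.com/ACNS">
  <Case><ID>example-id</ID><Status>Open</Status></Case>
</Infringement>
XML;

$document = new DOMDocument();
$document->loadXML($xml);

$xpath = new DOMXPath($document);
// Register a prefix of our own for the namespace URI.
$xpath->registerNamespace('acns', 'http://www.movielabs.com/ACNS');

$ids = [];
foreach ($xpath->query('//acns:ID') as $node) {
    $ids[] = $node->textContent;
}
var_dump($ids);

// Without the prefix nothing matches, because //ID means "ID in no namespace".
$unprefixedCount = $xpath->query('//ID')->length;
var_dump($unprefixedCount);  // int(0)
```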
In SimpleXML, you can register the namespace on the SimpleXMLElement.
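A sketch for SimpleXML, this time with a document that uses the `movie` prefix while the XPath side registers a different one (`acns`, picked arbitrarily). It illustrates that only the namespace URI has to agree. `example-id` is a placeholder value.

```php
<?php
// The document uses the prefix "movie"; the XPath below uses "acns" for the same URI.
$xml = <<<'XML'
<movie:Infringement xmlns:movie="http://www.movielabs.com/ACNS">
  <movie:Case><movie:ID>example-id</movie:ID></movie:Case>
</movie:Infringement>
XML;

$element = simplexml_load_string($xml);
$element->registerXPathNamespace('acns', 'http://www.movielabs.com/ACNS');

$ids = [];
foreach ($element->xpath('//acns:ID') as $match) {
    $ids[] = (string) $match;
}
var_dump($ids);
```

If you run XPath on an element returned by an earlier query, it is safest to register the prefix on that element again before calling xpath() on it.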
HINT: The default namespace is only used for elements; attributes are in the "no/empty namespace" unless they have a prefix.
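A quick way to check this, as a sketch. The `status` attribute is invented for the example and is not part of the posted document.

```php
<?php
// Hypothetical element with an unprefixed attribute under a default namespace.
$document = new DOMDocument();
$document->loadXML('<Infringement xmlns="http://www.movielabs.com/ACNS" status="Open"/>');

$xpath = new DOMXPath($document);
$xpath->registerNamespace('acns', 'http://www.movielabs.com/ACNS');

// The element needs the prefix, the attribute does not.
$unprefixedAttr = $xpath->query('/acns:Infringement/@status')->length;

// Prefixing the attribute looks for it in the ACNS namespace, where it is not.
$prefixedAttr = $xpath->query('/acns:Infringement/@acns:status')->length;

// The attribute node itself reports no namespace.
$attrNamespace = $document->documentElement->getAttributeNode('status')->namespaceURI;

var_dump($unprefixedAttr);  // int(1)
var_dump($prefixedAttr);    // int(0)
var_dump($attrNamespace);   // NULL
```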
I'm not well-versed in PHP's XML API, but I suspect the problem lies in the namespaces. Depending on how that xpath method works, it may be searching for ID elements with an empty namespace. Your ID elements inherit their namespace from the root element.