XPath namespace wildcard best practice

Posted 2020-01-30 01:33

Question:

In our case we have an XML tag whose namespace prefix is dynamic, something like this:

<ab:SomeProcessResponse xmlns:ab="http://something.com/xyz"
                        xmlns:cd="http://something.com/lmno">

Or sometimes I might get the response as:

<def:SomeProcessResponse xmlns:def="http://something.com/xyz"
                         xmlns:cd="http://something.com/lmno">

What best practice should I follow to select the SomeProcessResponse node in such cases?

Answer 1:

The best practice for selecting a namespaced XML element is to specify the namespace, not to ignore it via any sort of "wildcard" mechanism.

Compliant XPath processors always provide a mechanism for binding a namespace prefix (e.g., ab) to a namespace URI (e.g., http://something.com/xyz). The prefix used in the XPath expression does not even have to be the one used in the source XML (ab); only the namespace URI (http://something.com/xyz) must match. The one challenge is that XPath itself does not have a mechanism for binding namespace prefixes to namespace URIs. How to do so depends on the facilities offered by the library (e.g., libxml2, javax.xml.xpath) or the hosting language (e.g., XSLT).
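
To make the "only the URI must match" point concrete, here is a minimal sketch in plain JavaScript. It does not parse XML; expandedName and sameName are hypothetical helpers that model an element name as a (namespace URI, local name) pair, with the prefix carried along only to show that it plays no part in the comparison. The third URI is invented purely for contrast.

```javascript
// Hypothetical model of an element name: prefix, namespace URI and local name.
function expandedName(prefix, namespaceURI, localName) {
  return { prefix, namespaceURI, localName };
}

// Two names are the same when namespace URI and local name agree.
// The prefix is deliberately ignored.
function sameName(a, b) {
  return a.namespaceURI === b.namespaceURI && a.localName === b.localName;
}

const first = expandedName("ab", "http://something.com/xyz", "SomeProcessResponse");
const second = expandedName("def", "http://something.com/xyz", "SomeProcessResponse");
// Assumed example of an unrelated namespace that happens to reuse the prefix "ab".
const other = expandedName("ab", "http://somethingCompletelyDifferent.com/xyz", "SomeProcessResponse");

console.log(sameName(first, second)); // true: different prefixes, same URI
console.log(sameName(first, other));  // false: same prefix, different URI
```

A namespace-aware XPath processor compares names in this way, which is why the two responses in the question can be selected with a single expression.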

In order to provide a pure XPath answer, or sometimes because of a (commonly irrational) aversion to namespaces, you'll sometimes see a wildcard mechanism used. A common one is to use local-name() to reference only the local name (e.g., SomeProcessResponse), independent of the namespace. The problem is that this bypasses not only the namespace prefix (e.g., ab) but also the namespace URI (http://something.com/xyz). The namespace URI is integral to the name and is an important part of associating other facilities, such as validation and OO class mappings, with the element.

So, yes, there are wildcard mechanisms for dodging namespaces, but the best practice is to use the facilities of the hosting language/library to associate namespace prefixes with namespace URIs, not to avoid namespaces via wildcards.



Answer 2:

There's no need for a "wildcard" here, as both examples are the same XML: an element with the local name SomeProcessResponse and the namespace URI http://something.com/xyz. The fact that they use two different prefixes is irrelevant. The prefix bindings used in an XPath expression are up to the XPath library and don't have to match those used in the document.

You need to use the facilities your XPath library provides to bind a prefix such as xyz to the http://something.com/xyz namespace URI, then an XPath of xyz:SomeProcessResponse will match both examples.

To give a more specific answer I'd need to know which programming language and which XPath API you're using.


Edit: in your comment you say you're using JavaScript. In that case you can supply namespace bindings by passing a function to document.evaluate:

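// The third argument is the namespace resolver: it maps a prefix used in the
// XPath expression to a namespace URI, independent of the document's prefixes.
var result = document.evaluate("//xyz:SomeProcessResponse", document, function (prefix) {
  if (prefix === "xyz") {
    return "http://something.com/xyz";
  }
  return null; // any other prefix is unbound
}, XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null);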
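
If several prefixes are needed, the resolver can be built from a lookup table instead of a chain of if statements. The following is only a sketch: makeResolver is a hypothetical helper, and the prefix lmno is an assumed name for the second namespace URI from the question. The document.evaluate call is guarded because that API only exists where a DOM is available.

```javascript
// Hypothetical helper: build a namespace resolver from a prefix-to-URI map.
function makeResolver(bindings) {
  return function (prefix) {
    // Return the bound URI, or null for prefixes that are not in the map.
    return Object.prototype.hasOwnProperty.call(bindings, prefix)
      ? bindings[prefix]
      : null;
  };
}

const resolver = makeResolver({
  xyz: "http://something.com/xyz",
  lmno: "http://something.com/lmno", // assumed prefix for the second namespace
});

// Only run the query where a DOM with XPath support is present (e.g. a browser).
if (typeof document !== "undefined" && typeof document.evaluate === "function") {
  const nodes = document.evaluate(
    "//xyz:SomeProcessResponse",
    document,
    resolver,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  console.log(nodes.snapshotLength); // number of matching elements
}
```

The prefixes in the map are chosen by the caller; they only have to agree with the prefixes used in the XPath string, not with those in the document.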


Answer 3:

If you are declaring the namespaces, either response will be identified correctly.

If what you need is an XPath to locate elements regardless of namespace, you can use this:

//*[local-name()='SomeProcessResponse']

But bear in mind that this will match in all of the following cases:

<ab:SomeProcessResponse xmlns:ab="http://something.com/xyz" xmlns:cd="http://something.com/lmno">
<def:SomeProcessResponse xmlns:def="http://something.com/xyz" xmlns:cd="http://something.com/lmno">
<ab:SomeProcessResponse xmlns:ab="http://somethingCompletelyDifferent.com/xyz">
<SomeProcessResponse>
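
If binding a prefix is not an option, a commonly used middle ground is to keep local-name() but also test namespace-uri(), both of which are standard XPath 1.0 functions. The sketch below only builds the expression string; stepFor is a hypothetical helper, and it assumes that neither argument contains a single quote. An expression built this way would be expected to match only the first two cases listed above.

```javascript
// Hypothetical helper: build a prefix-free XPath step that still checks
// the namespace URI as well as the local name.
function stepFor(namespaceURI, localName) {
  // Assumption: neither value contains a single quote, so no quoting logic is needed.
  return "*[local-name()='" + localName + "' and namespace-uri()='" + namespaceURI + "']";
}

const xpath = "//" + stepFor("http://something.com/xyz", "SomeProcessResponse");
console.log(xpath);
// //*[local-name()='SomeProcessResponse' and namespace-uri()='http://something.com/xyz']
```

This is more verbose than a bound prefix, but unlike the bare local-name() test it does not throw the namespace URI away.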


Tags: xml xpath