XPath queries in IE use zero-based indexes but the

Posted 2019-04-27 18:15

Question:

The Problem

I am converting a relatively large piece of JavaScript that currently only works on Internet Explorer so that it works on the other browsers as well. Since the code uses XPath extensively, we made a little compatibility function to make things easier:

function selectNodes(xmlDoc, xpath){
    if('selectNodes' in xmlDoc){
        //use IE logic
    }else{
        //use W3C's document.evaluate
    }
}
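For readers who want to see the two branches filled in, here is a minimal sketch of what such a wrapper could look like. It is not the original code: it assumes the IE branch gets back an MSXML node list (with length and item(i)), that the other branch can call evaluate on the XML document itself with no namespace resolver, and that returning a plain array is acceptable to callers.

```javascript
// Sketch of the compatibility wrapper; returns a plain array in both branches.
function selectNodes(xmlDoc, xpath) {
    var nodes = [], i;
    if ('selectNodes' in xmlDoc) {
        // IE / MSXML: selectNodes returns a node list with length and item(i).
        var list = xmlDoc.selectNodes(xpath);
        for (i = 0; i < list.length; i++) {
            nodes.push(list.item(i));
        }
    } else {
        // W3C DOM Level 3 XPath. 7 is the numeric value of
        // XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, used as a fallback
        // when the XPathResult global is not available.
        var type = (typeof XPathResult !== 'undefined')
            ? XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
            : 7;
        // Assumption: no namespace resolver is needed (third argument is null).
        var result = xmlDoc.evaluate(xpath, xmlDoc, null, type, null);
        for (i = 0; i < result.snapshotLength; i++) {
            nodes.push(result.snapshotItem(i));
        }
    }
    return nodes;
}
```

Copying the results into an array hides the difference between the MSXML node list and the W3C snapshot result from the calling code.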

This is mostly working fine but we just came across the limitation that positions in IE are zero-based but in the W3C model used by the other browsers they are one-based. This means that to get the first element we need to do //books[0] in IE and //books[1] in the other browsers.

My proposed solution

The first thought was using a regex to add one to all indexes that appear in the queries if we are using the document.evaluate version:

function addOne(nStr){ return 1 + parseInt(nStr, 10); }

xpath = xpath.replace(
    /\[\s*(\d+)\s*\]/g,
    function(_, nStr){ return '[' + addOne(nStr) + ']'; }
);

My question

Is this regex based solution reasonably safe?

  • Are there any places it will convert something it should not?
  • Are there any places where it will not convert something it should?

For example, it would fail to replace the index in //books[position()=1] but since IE doesn't appear to support position() and our code is not using that I think this particular case would not be a problem.
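To make the two questions concrete, here is a small sketch that wraps the replacement in a function and runs it on three sample queries. The name shiftIndexes and the sample queries are made up for illustration; the regex is the one proposed above.

```javascript
// Hypothetical helper: shifts bare numeric predicates from zero-based to one-based.
function shiftIndexes(xpath) {
    return xpath.replace(/\[\s*(\d+)\s*\]/g, function (_, nStr) {
        return '[' + (parseInt(nStr, 10) + 1) + ']';
    });
}

// Plain positional predicates are shifted, with or without inner whitespace.
console.log(shiftIndexes('//books[0]/title[ 2 ]'));  // //books[1]/title[3]

// A bracketed number inside a string literal is shifted too,
// which changes the meaning of the comparison.
console.log(shiftIndexes('//books[@id="a[0]"]'));    // //books[@id="a[1]"]

// An index expressed through position() is left untouched.
console.log(shiftIndexes('//books[position()=1]'));  // //books[position()=1]
```

So the regex over-converts when a string literal happens to contain something like [0], and under-converts whenever the position is not written as a bare number in brackets.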


Considerations

  • I downloaded Sarissa to see if they have a way to solve this but after looking at the source code apparently they don't?

  • I want to add one to the W3C version instead of subtracting one in the IE version to ease my conversion effort.


In the end

We decided to rewrite the code to use proper XPath in IE too, by setting the selection language:

xmlDoc.setProperty("SelectionLanguage", "XPath");

Answer 1:

we just came across the limitation that positions in IE are zero-based but in the W3C model used by the other browsers they are one-based. This means that to get the first element we need to do //books[0] in IE and //books[1] in the other browsers.

Before doing any XPath selection, specify:

xmlDoc.setProperty("SelectionLanguage", "XPath");

By default, MSXML3 uses a dialect of XSLT/XPath that was in use before XSLT and XPath became W3C Recommendations. The default selection language is "XSLPattern", and that is the behavior you are seeing.

Read more on this topic here:

http://msdn.microsoft.com/en-us/library/windows/desktop/ms754679(v=vs.85).aspx
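As a rough sketch of how this could be combined with the feature check from the question: the helper below only calls setProperty when the document exposes it, so the same call site can be used in every browser. The function name is made up, and it assumes that the in operator works for setProperty on the MSXML document the same way the question uses it for selectNodes.

```javascript
// Hypothetical helper: switch an MSXML document to standard XPath if possible.
// Returns true if the property was set, false otherwise.
function useStandardXPath(xmlDoc) {
    if ('setProperty' in xmlDoc) {
        // MSXML only: replaces the default "XSLPattern" dialect.
        xmlDoc.setProperty('SelectionLanguage', 'XPath');
        return true;
    }
    // Documents without setProperty are left untouched.
    return false;
}
```

Calling it once right after the document is loaded should be enough, since the property is set on the document rather than per query.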



Answer 2:

Why not modify the original expressions, so that this:

var expr = "books[1]";

...becomes:

var expr = "books[" + index(1)  + "]";

...where index is defined as (pseudocode):

function index(i) {
    // isIE is assumed to be a flag set elsewhere, e.g. by feature detection
    return isIE ? (i - 1) : i;
}