Can't figure out why I can't retrieve a si

Posted 2019-03-01 05:14

I can't figure out why I can't retrieve a simple string with XPath using this very simple snippet:

var page = new WebPage();
page.open('http://free.fr', function (status) {
    if (status !== 'success') {
        console.log('Unable to access network');
    } else {
        function getElementByXpath(path) {
          return document.evaluate(path, document, null, XPathResult.STRING_TYPE, null).stringValue;
        }

        console.log( getElementByXpath("//title/text()") );
    }
    phantom.exit();
});

It always returns nothing.

What am I missing to print the title value?

1 answer
再贱就再见
#2 · 2019-03-01 05:48

PhantomJS has two contexts: the outer (script) context and the page context. Only the page context (DOM context) has access to the DOM, and it is sandboxed. You get access to it through page.evaluate, but remember that:

Note: The arguments and the return value to the evaluate function must be a simple primitive object. The rule of thumb: if it can be serialized via JSON, then it is fine.

Closures, functions, DOM nodes, etc. will not work!

This means that you cannot pass any DOM node that you find to the outer context. There is a document object outside of the DOM context, but it doesn't do anything useful; it is only a relic of the way PhantomJS is built on top of QtWebKit. The snippet in the question queries that outer document, which is why it prints nothing.
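To make the "if it can be serialized via JSON" rule of thumb concrete, here is a minimal sketch in plain JavaScript. The helper name jsonRoundTrip is hypothetical and is not part of PhantomJS; it only approximates what survives the boundary by round-tripping a value through JSON, and it makes no claim about how PhantomJS marshals values internally.

```javascript
// Hypothetical helper, not a PhantomJS API: approximates the rule of thumb
// by pushing a value through JSON.stringify and JSON.parse.
function jsonRoundTrip(value) {
    var text = JSON.stringify(value);
    // JSON.stringify yields undefined for values it cannot represent
    // (for example a bare function), and JSON.parse(undefined) would throw.
    return text === undefined ? undefined : JSON.parse(text);
}

console.log(jsonRoundTrip('Free'));                   // a string comes back unchanged
console.log(jsonRoundTrip({ title: 'Free', n: 1 }));  // plain data comes back as an equal copy
console.log(jsonRoundTrip(function () {}));           // a bare function does not survive
console.log(jsonRoundTrip({ f: function () {} }));    // function-valued properties are dropped
```

Strings, numbers and plain objects pass this check, so they are safe to return from page.evaluate. Functions and DOM nodes are not, so extract the text you need inside the page context and return that instead.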

Here's an example fix:

var page = new WebPage();
page.onConsoleMessage = function(msg){
    console.log("remote: " + msg);
};
page.open('http://google.fr', function (status) {
    if (status !== 'success') {
        console.log('Unable to access network');
    } else {
        page.evaluate(function(){
            function getElementByXpath(path) {
              return document.evaluate(path, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
            }

            console.log( getElementByXpath("//head/title/text()").textContent );
        });
    }
    phantom.exit();
});
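As a variation on the fix above, the page-context function can also return the title as a string, since a string is JSON-serializable. This is only a sketch: the function name readTitle is chosen here for illustration, it reuses the XPath expression and STRING_TYPE call from the question, and the typeof phantom guard is an assumption added so that the file does nothing outside PhantomJS.

```javascript
// Runs inside the page context once handed to page.evaluate;
// `document` and `XPathResult` are then the page's own globals.
function readTitle() {
    return document.evaluate("//title/text()", document, null,
                             XPathResult.STRING_TYPE, null).stringValue;
}

// Assumption: only drive a page when the `phantom` global exists,
// i.e. when this script is actually run by PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = new WebPage();
    page.open('http://free.fr', function (status) {
        if (status !== 'success') {
            console.log('Unable to access network');
        } else {
            // The string returned by readTitle crosses the sandbox by value,
            // so it can be logged directly from the outer context.
            console.log(page.evaluate(readTitle));
        }
        phantom.exit();
    });
}
```

With this shape no page.onConsoleMessage handler is needed, because nothing is logged from inside the page; only the final string leaves the sandbox.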