Get JSON page content with PhantomJS

Posted 2019-01-24 00:53

I would like to know how to parse JSON in PhantomJS. Any page content is wrapped in HTML (<html><body><pre>{JSON string}</pre></body></html>). Is there an option to remove the enclosing tags, or to request a different Content-Type such as "application/json"? If not, what is the best way to parse it? Should I use jQuery after loading it with includeJs?

4 answers
何必那么认真
#2 · 2019-01-24 01:04

Note that if the JSON data contains HTML strings, the suggested page.plainText property will strip them out.
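If stripped markup is a concern, one possible workaround is to read page.content (the raw HTML) and unwrap the <pre> element yourself. The following is only a sketch: extractJsonText is a hypothetical helper, not part of the PhantomJS API, and it assumes the JSON sits in a single <pre> element with only the basic HTML entities escaped.

```javascript
// Hypothetical helper (not a PhantomJS API): pulls the text out of the <pre>
// wrapper and decodes the basic HTML entities a browser would have escaped.
function extractJsonText(html) {
  var match = /<pre[^>]*>([\s\S]*?)<\/pre>/i.exec(html);
  // Fall back to the input unchanged if there is no <pre> wrapper.
  var text = match ? match[1] : html;
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&'); // decode &amp; last to avoid double-decoding
}

// Sample input shaped like the wrapper described in the question.
var wrapped = '<html><body><pre>{"html":"&lt;b&gt;bold&lt;/b&gt;"}</pre></body></html>';
var parsed = JSON.parse(extractJsonText(wrapped));
console.log(parsed.html); // <b>bold</b>
```

In a PhantomJS script you would pass page.content to the helper instead of the sample string.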

Bombasti
#3 · 2019-01-24 01:07

As the accepted answer already says, I would suggest using JSON.parse() to convert a JSON string into an object.

For example, your code could look like this:

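var jsonObject = page.evaluate(function() {
  // This callback runs inside the web page, where the `page` object does not
  // exist, so read the text from the DOM instead of page.plainText.
  return JSON.parse(document.body.innerText);
});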
萌系小妹纸
#4 · 2019-01-24 01:14

Here is what I did:

var obj = page.evaluate(function() {
    // eval executes the page text as code; JSON.parse is safer for untrusted pages.
    return eval('(' + document.body.innerText + ')');
});

obj is then the object parsed from the JSON that the page returned.
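For comparison, here is a small sketch of how eval and JSON.parse differ on the same input. The sample strings are made up for illustration. For valid JSON both give the same object, but eval will also run anything else the page happens to return, while JSON.parse rejects it.

```javascript
// Valid JSON: both approaches produce an equivalent object.
var jsonText = '{"a": 1}';
var viaEval = eval('(' + jsonText + ')');
var viaParse = JSON.parse(jsonText);
console.log(viaEval.a === viaParse.a); // true

// Not JSON, but valid JavaScript: eval runs it, JSON.parse throws.
var hostile = '(function () { return "code ran"; })()';
var evalResult = eval('(' + hostile + ')');

var parseRejected = false;
try {
  JSON.parse(hostile);
} catch (e) {
  parseRejected = true;
}
console.log(evalResult, parseRejected); // code ran true
```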

太酷不给撩
#5 · 2019-01-24 01:21

Since you are using PhantomJS, which is built on WebKit, you have access to the native JSON object. There is no need to use page.evaluate; you can just read the plainText property of the page object.

http://phantomjs.org/api/webpage/property/plain-text.html

var page = require('webpage').create();
page.open('http://somejsonpage.com', function () {
    var jsonSource = page.plainText;
    var resultObject = JSON.parse(jsonSource);
    phantom.exit();
});
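If the page might fail to load or return something that is not JSON, JSON.parse will throw. A minimal sketch of a guard follows; parseJsonSafely is a hypothetical helper name, not something PhantomJS provides.

```javascript
// Hypothetical helper: wraps JSON.parse so a malformed body yields an error
// result instead of an uncaught exception.
function parseJsonSafely(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (e) {
    return { ok: false, error: e.message };
  }
}

var good = parseJsonSafely('{"name": "phantom", "count": 2}');
var bad = parseJsonSafely('<html>Not JSON</html>');
console.log(good.ok, good.value.count); // true 2
console.log(bad.ok); // false
```

Inside the page.open callback you could check that the status argument is 'success' and then pass page.plainText to the helper before calling phantom.exit().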