Load a web page, execute its JavaScript and dump r

Posted 2019-02-27 04:25

Question:

I need to load a web page, execute its JavaScript (including all JS files referenced by its script tags) and dump the resulting HTML to a file. This needs to be done on the server. I have tried node.js with zombie.js, but it seems too immature to work in the real world. More often than not it throws a bogus exception on a page that a real browser (Firefox) handles without issues.

My node.js code is:

var zombie = require("zombie");

// Load the page
var browser = new zombie.Browser({ debug: false });
browser.visit('http://www.dba.dk', function (error, browser, status) {
    // Report a failed visit instead of silently ignoring it
    if (error) { console.log('Error:' + error.message); }
    // Dump the rendered HTML only when the page loaded successfully
    if (!error && browser.statusCode == 200) {
        console.log(browser.html);
    }
});

It exits with the exception "TypeError: Cannot call method 'toString' of null".

Jaxer is not really an option. I need to download a 3rd party page and execute it on my server. How would I do that with Jaxer?

Answer 1:

Perhaps that’s because you are using err.message whereas err is not defined? error, on the other hand, is defined.


Update

Did you check out PhantomJS?
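
For what it's worth, a minimal PhantomJS sketch of "load the page, let its scripts run, dump the HTML" might look like the following. The helper name dumpRenderedHtml and the output file dba.html are made up for illustration; the calls it relies on (require('webpage').create(), page.open, page.content, fs.write and phantom.exit) come from PhantomJS's own scripting API. It assumes the page is ready by the time the open callback fires, which may not hold for pages that keep loading content afterwards.

```javascript
// Hypothetical helper: open `url` in a PhantomJS page object and write the
// resulting DOM (after the page's scripts have run) to `outPath`.
function dumpRenderedHtml(page, fs, url, outPath, done) {
    page.open(url, function (status) {
        // PhantomJS reports 'success' or 'fail' as a plain string
        if (status !== 'success') {
            done(new Error('Failed to load ' + url));
            return;
        }
        // page.content holds the serialized HTML of the current DOM
        fs.write(outPath, page.content, 'w');
        done(null);
    });
}

// Only wire things up when actually running under PhantomJS
if (typeof phantom !== 'undefined') {
    dumpRenderedHtml(
        require('webpage').create(),
        require('fs'),
        'http://www.dba.dk',
        'dba.html', // assumed output file name
        function (err) {
            if (err) { console.log('Error: ' + err.message); }
            phantom.exit(err ? 1 : 0);
        }
    );
}
```

Saved as, say, dump.js, this would be run with "phantomjs dump.js". If the page builds its content asynchronously, you would probably need to wait (for example with setTimeout inside the open callback) before reading page.content.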

Also, it looks like Aptana Jaxer could do what you want. To quote John Resig:

Imagine ripping off the visual rendering part of Firefox and replacing it with a hook to Apache instead - roughly speaking that's what Jaxer is.