CasperJS cannot trigger Twitter infinite scroll

Posted 2019-03-27 15:26

Question:

I am trying to get some information from Twitter using CasperJS, and I'm stuck on the infinite scroll. Even when I use jQuery to scroll the page down, nothing seems to work: neither scrolling nor triggering the exact event on window (something like uiNearTheBottom) helps. Interestingly, all of these attempts work when I inject the JS code via the JS console in Firefox and Chrome. Here's the example code:

casper.thenEvaluate(function(){
    $(window).trigger('uiNearTheBottom');
});

or

casper.thenEvaluate(function(){
    document.body.scrollTop  =  document.body.scrollHeight;
});

Answer 1:

If casper.scrollToBottom() (or casper.scroll_to_bottom()) fails you, the snippet below should work:

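var pageHeight = this.evaluate(function () { return document.body.scrollHeight; });
this.page.scrollPosition = { top: this.page.scrollPosition.top + pageHeight, left: 0 };

A working example:

casper.start(url, function () {
    this.wait(10000, function () {
        // Read the height inside the page; `document` in the CasperJS script is not the page's document
        var pageHeight = this.evaluate(function () {
            return document.body.scrollHeight;
        });
        this.page.scrollPosition = { top: this.page.scrollPosition.top + pageHeight, left: 0 };
        if (this.visible("div.load-more")) {
            this.echo("I am here");
        }
    });
});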

It uses the scrollPosition property of the underlying PhantomJS page object.



Answer 2:

CasperJS is based on PhantomJS and, according to the discussion below, no window object exists for the headless browser.

You can check the discussion here



Answer 3:

On Twitter you can use:

casper.scrollToBottom();
casper.wait(1000, function () {
    casper.capture("loadedContent.png");
});

But if you include jQuery as a client script, as below, the above code won't work!

var casper = require('casper').create({
    clientScripts: [
        'jquery-1.11.0.min.js'
    ]
});

The script injection blocks Twitter's infinite scroll from loading content. On BoingBoing.net, CasperJS scrollToBottom() works with jQuery without blocking. It really depends on the site.

However, you can inject jQuery after the content has loaded.

casper.scrollToBottom();
casper.wait(1000, function () {
    casper.capture("loadedContent.png");

    // Inject jQuery into the already-loaded page
    // (options.clientScripts is only applied when a page loads)
    casper.page.injectJs("jquery.js");

    // And use like so...
    var height = casper.evaluate(function () {
        return $(document).height();
    });
});


Answer 4:

I have adapted this from a previous answer:

var iterations = 5; // number of pages to go through
var timeToWait = 2000; // time to wait in milliseconds

var last;
var list = [];

for (var i = 0; i <= iterations; i++) {
    list.push(i);
}

// Evaluate this in the browser context; the counter window.x is read back in CasperJS via getGlobal('x')
casper.thenEvaluate(function(iters, waitTime) {
    window.x = 0;
    var intervalID = setInterval(function() {
        console.log("Using setInterval " + window.x);
        window.scrollTo(0, document.body.scrollHeight); 

        if (++window.x === iters) {
            window.clearInterval(intervalID);
        }
    }, waitTime);
}, iterations, timeToWait);

casper.each(list, function(self, i) {

    self.wait(timeToWait, function() {
        last = i;
        this.echo('Using this.wait ' + i);
    });

});

casper.waitFor(function() {
    return (last === list[list.length - 1] && iterations === this.getGlobal('x'));
}, function() {
    this.echo('All done.');
});

Essentially, I enter the page context, scroll to the bottom, and then wait 2 seconds for the content to load, repeating this for each iteration. Obviously I would have liked to use repeated applications of casper.scrollToBottom() or something more sophisticated, but the loading time wasn't allowing me to make this happen.
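
The repeated casper.scrollToBottom() approach mentioned above could look roughly like the following. This is only a sketch: scrollUntilStable is a hypothetical helper, not a CasperJS function, and it assumes that newly loaded content makes document.body.scrollHeight grow within the wait time, which may not hold on every site. It only relies on scrollToBottom(), evaluate() and wait(), all of which a CasperJS instance provides.

```javascript
// Hypothetical helper (not part of CasperJS): scroll, wait, and repeat until
// the page height stops growing or maxRounds scrolls have been performed.
// `casper` is any object exposing scrollToBottom(), evaluate(fn) and
// wait(ms, callback), as a CasperJS instance does.
function scrollUntilStable(casper, waitMs, maxRounds, done) {
    var previousHeight = -1;
    var rounds = 0; // number of scrolls performed so far

    function pageHeight() {
        // Runs inside the page, so `document` is the loaded page's document.
        return casper.evaluate(function () {
            return document.body.scrollHeight;
        });
    }

    function step() {
        var height = pageHeight();
        // Stop when the last scroll loaded nothing new, or when the cap is hit.
        if (height === previousHeight || rounds >= maxRounds) {
            done(rounds, height);
            return;
        }
        previousHeight = height;
        rounds += 1;
        casper.scrollToBottom();
        // Give the site time to load more content, then check again.
        casper.wait(waitMs, step);
    }

    step();
}
```

In a script it would be called from inside a step, for example casper.then(function () { scrollUntilStable(this, 2000, 5, function (rounds) { casper.echo('Scrolled ' + rounds + ' times'); }); }). Comparing heights rather than counting a fixed number of iterations lets the loop end early once the timeline stops growing, while maxRounds keeps it from running forever on a truly endless feed.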