I am trying to get some information from Twitter using CasperJS, and I'm stuck on the infinite scroll. Even when I use jQuery to scroll the page down, nothing seems to work: neither scrolling nor triggering the exact event on window (something like uiNearTheBottom) helps.
Interestingly, all of these attempts work when I inject the JS code via the JS console in Firefox and Chrome.
Here's the example code:
casper.thenEvaluate(function () {
    $(window).trigger('uiNearTheBottom');
});
or
casper.thenEvaluate(function () {
    document.body.scrollTop = document.body.scrollHeight;
});
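If casper.scrollToBottom() or casper.scroll_to_bottom() fails you, the snippet below should serve you. The page height is read inside the page context with evaluate(), because the Casper script itself runs outside the loaded page:
var pageHeight = this.evaluate(function () { return document.body.scrollHeight; });
this.page.scrollPosition = { top: this.page.scrollPosition.top + pageHeight, left: 0 };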
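A working example:
casper.start(url, function () {
    this.wait(10000, function () {
        // Read the height in the page context; the Casper script runs outside the loaded page
        var pageHeight = this.evaluate(function () { return document.body.scrollHeight; });
        this.page.scrollPosition = { top: this.page.scrollPosition.top + pageHeight, left: 0 };
        if (this.visible("div.load-more")) {
            this.echo("I am here");
        }
    });
});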
It uses the underlying PhantomJS scroll found here
CasperJS is based on PhantomJS and, according to the discussion below, no window object exists for the headless browser.
You can check the discussion here
On Twitter you can use:
casper.scrollToBottom();
casper.wait(1000, function () {
    casper.capture("loadedContent.png");
});
But if you include jQuery as a client script, the code above won't work:
var casper = require('casper').create({
    clientScripts: [
        'jquery-1.11.0.min.js'
    ]
});
The script injection blocks Twitter's infinite scroll from loading content. On BoingBoing.net, CasperJS scrollToBottom() works with jQuery without blocking. It really depends on the site.
However, you can inject jQuery after the content has loaded:
casper.scrollToBottom();
casper.wait(1000, function () {
    casper.capture("loadedContent.png");
    // Inject the jQuery library into the current page. Pushing onto
    // casper.options.clientScripts would only affect pages loaded later.
    casper.page.injectJs("jquery.js");
    // And use it like so...
    var height = casper.evaluate(function () {
        return $(document).height();
    });
});
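As a rough sketch of the late-injection idea, the helper below first checks whether the page already exposes jQuery and injects a local copy only if it does not. The name ensureJQuery and the path passed to it are illustrative assumptions, not part of CasperJS; the sketch relies on casper.evaluate() and on PhantomJS's page.injectJs(), which returns true when the injection succeeds.

```javascript
// Hypothetical helper: make sure jQuery is available in the loaded page.
// `casperLike` is assumed to be a CasperJS instance, i.e. it offers
// evaluate() and exposes the PhantomJS page as `page`.
function ensureJQuery(casperLike, scriptPath) {
    // Runs in the page context: is jQuery already defined there?
    var alreadyLoaded = casperLike.evaluate(function () {
        return typeof window.jQuery === 'function';
    });
    if (alreadyLoaded) {
        return true;
    }
    // Inject the local file into the current page; true on success.
    return casperLike.page.injectJs(scriptPath);
}
```

Inside a step this could be called as ensureJQuery(this, "jquery.js") once the scrolled content has been captured, so the site's own scripts have finished their work before anything extra is added.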
I have adapted this from a previous answer:
var iterations = 5; // number of pages to go through
var timeToWait = 2000; // time to wait in milliseconds
var last;
var list = [];
for (var i = 0; i <= iterations; i++) {
    list.push(i);
}
// Evaluate this in the browser context and expose the counter to CasperJS via window.x
casper.thenEvaluate(function (iters, waitTime) {
    window.x = 0;
    var intervalID = setInterval(function () {
        console.log("Using setInterval " + window.x);
        window.scrollTo(0, document.body.scrollHeight);
        if (++window.x === iters) {
            window.clearInterval(intervalID);
        }
    }, waitTime);
}, iterations, timeToWait);
casper.each(list, function (self, i) {
    self.wait(timeToWait, function () {
        last = i;
        this.echo('Using this.wait ' + i);
    });
});
casper.waitFor(function () {
    return (last === list[list.length - 1] && iterations === this.getGlobal('x'));
}, function () {
    this.echo('All done.');
});
Essentially, I enter the page context, scroll to the bottom, and then wait 2 seconds for the content to load, once per iteration. Obviously I would have liked to use repeated applications of casper.scrollToBottom() or something more sophisticated, but the loading time did not allow it.
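For what it's worth, the "repeated casper.scrollToBottom()" idea can be sketched as a loop that scrolls, waits, and stops once the page height no longer grows. This is only a sketch under assumptions: scrollUntilStable, maxRounds and waitMs are made-up names, and it assumes newly loaded content always increases document.body.scrollHeight. It uses only scrollToBottom(), wait() and evaluate() on the Casper instance passed in.

```javascript
// Hypothetical helper: scroll until the page height stops growing or a cap is hit.
// `casperLike` is assumed to offer scrollToBottom(), wait(ms, fn) and evaluate(fn),
// as a CasperJS instance does. `done` receives a small summary object.
function scrollUntilStable(casperLike, options, done) {
    var maxRounds = options.maxRounds; // hard cap on scroll attempts
    var waitMs = options.waitMs;       // pause after each scroll so content can load
    var rounds = 0;

    function readHeight() {
        // Runs in the page context and returns the current document height.
        return casperLike.evaluate(function () {
            return document.body.scrollHeight;
        });
    }

    function step(previousHeight) {
        if (rounds >= maxRounds) {
            return done({ rounds: rounds, height: previousHeight, stable: false });
        }
        rounds += 1;
        casperLike.scrollToBottom();
        casperLike.wait(waitMs, function () {
            var height = readHeight();
            if (height === previousHeight) {
                // Nothing new was loaded during the wait, so stop here.
                return done({ rounds: rounds, height: height, stable: true });
            }
            step(height);
        });
    }

    step(readHeight());
}

// Possible use from inside a step:
// casper.then(function () {
//     scrollUntilStable(this, { maxRounds: 10, waitMs: 2000 }, function (result) {
//         casper.echo('Stopped after ' + result.rounds + ' rounds');
//     });
// });
```

The cap matters on feeds that never run out of content, and the wait time plays the same role as timeToWait above: too short a pause makes the height look stable before the next batch has arrived.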