I am trying to scrape several elements from a table on this site to teach myself scraping with node.js, cheerio, and request.
I am having trouble getting the items in the table. Essentially, I want to get 'rank', 'company', and '3-year growth' from each row. How do I do this?
Based on an online tutorial, I wrote my scraping.js script as follows:
var request = require('request'),
    cheerio = require('cheerio');
request('http://www.inc.com/inc5000/index.html', function (error, response, html) {
  if (!error && response.statusCode == 200) {
    var $ = cheerio.load(html);
    $('tr.ng-scope').each(function (i, element) { // problem probably lies here
      var a = $(this).get(0);
      console.log(a);
    });
  }
});
However, I am fairly sure the commented line above is wrong. Is there a better way to access the values in the table?
I noticed the XPaths are as follows:
//*[@id="col-r"]/table/tbody/tr[2]/td[1] -- rank
//*[@id="col-r"]/table/tbody/tr[2]/td[2]/a -- name of company
//*[@id="col-r"]/table/tbody/tr[2]/td[3] -- 3-year growth rate
I am just trying to figure out how to access these values accordingly.
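To make the XPath side concrete, here is a minimal sketch of how paths of this shape (an id prefix followed by bracketed positions such as `tr[2]/td[1]`) could be rewritten as CSS selectors, which is the syntax cheerio takes. The helper name `xpathToCss` is hypothetical and not part of cheerio or request. It only handles this one simple pattern, and it assumes the rows actually exist in the HTML that request receives, which may not hold if the table is filled in client-side.

```javascript
// Hypothetical helper (not part of cheerio): converts a simple XPath of the
// form //*[@id="x"]/a/b[2]/c into a CSS child-combinator selector.
// Supports only the id prefix, plain tag steps, and positional [n] steps.
function xpathToCss(xpath) {
  var match = /^\/\/\*\[@id="([^"]+)"\](.*)$/.exec(xpath);
  if (!match) {
    throw new Error('Unsupported XPath: ' + xpath);
  }
  // Split the remainder into steps, dropping the empty piece before the first "/".
  var steps = match[2].split('/').filter(function (s) {
    return s.length > 0;
  });
  var parts = steps.map(function (step) {
    var m = /^([A-Za-z][\w-]*)(?:\[(\d+)\])?$/.exec(step);
    if (!m) {
      throw new Error('Unsupported step: ' + step);
    }
    // XPath tr[2] selects the second <tr> child, i.e. tr:nth-of-type(2) in CSS.
    return m[2] ? m[1] + ':nth-of-type(' + m[2] + ')' : m[1];
  });
  return ['#' + match[1]].concat(parts).join(' > ');
}

console.log(xpathToCss('//*[@id="col-r"]/table/tbody/tr[2]/td[1]'));
console.log(xpathToCss('//*[@id="col-r"]/table/tbody/tr[2]/td[2]/a'));
```

The resulting string could then be passed to `$()` after `cheerio.load(html)`, with `.text()` called on the match to read the cell. Dropping the `:nth-of-type(2)` on the `tr` step would select that cell in every row rather than only the second one.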