Executing scraped JavaScript with cheerio

Posted 2019-02-22 10:16

I have a web page containing some JS APIs that don't alter the DOM but return some numbers. I'd like to write a Node.js application that downloads such pages and executes those functions in the context of the downloaded page.

I was looking at cheerio for page scraping, but while I see how easy it is to navigate and manipulate the DOM with it, I don't see any way to run the page's functions. Is it possible to do that?

Should I look, instead, at jsdom?

Thanks

2 answers
霸刀☆藐视天下
#2 · 2019-02-22 11:01

Cheerio only parses and manipulates HTML; it has no notion of executing JavaScript. jsdom can run a page's scripts, but only when you enable that explicitly through its runScripts option. If the API you wish to access is written in plain JavaScript, there is little to prevent you from extracting the functions and running them inside Node. Beware, though: downloading and executing arbitrary JavaScript can pose a huge security risk. If you want to simulate the behaviour of a full browser, look at http://phantomjs.org/. PhantomJS is a standalone headless browser (not a Node module) that is scripted in JavaScript and can do most of what an ordinary browser can.
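As a rough illustration of the "extract them and run them inside Node" idea, here is a minimal sketch that uses only Node's built-in vm module. The page content, the getAnswer function, and the helper names extractInlineScripts and callPageFunction are all hypothetical, made up for this example. It assumes the page functions are top-level function declarations in inline script tags and do not need a DOM or other browser globals. Note that the Node documentation states vm is not a security mechanism, so this does not make untrusted code safe to run.

```javascript
const vm = require('node:vm');

// Hypothetical page content standing in for the downloaded HTML.
const html = `
<html><body>
<script>
  function getAnswer() { return 6 * 7; }
</script>
</body></html>`;

// Collect the bodies of inline <script> tags. A regex is enough for this
// sketch; an HTML parser such as cheerio would be more robust on real pages.
function extractInlineScripts(page) {
  const scripts = [];
  const re = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = re.exec(page)) !== null) {
    if (match[1].trim() !== '') scripts.push(match[1]);
  }
  return scripts;
}

// Run every inline script in one fresh context, then call a function by name.
// Only top-level function declarations and var bindings become properties of
// the context object; let/const bindings would not be reachable this way.
function callPageFunction(page, functionName, args = []) {
  const context = vm.createContext({});
  for (const source of extractInlineScripts(page)) {
    // The timeout guards against runaway loops, not against malicious code.
    vm.runInContext(source, context, { timeout: 1000 });
  }
  const fn = context[functionName];
  if (typeof fn !== 'function') {
    throw new Error(`${functionName} is not defined by the page scripts`);
  }
  return fn(...args);
}

console.log(callPageFunction(html, 'getAnswer')); // prints 42
```

If the page scripts touch document or window, this approach fails with a ReferenceError, and a DOM-aware tool such as jsdom or a headless browser is the better fit.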

萌系小妹纸
#3 · 2019-02-22 11:22

Sounds like you want to use PhantomJS, which will provide the fully rendered output, and then use cheerio on that.
