Is there any JavaScript web crawler framework?
Server-side?
Try node-crawler: https://github.com/joshfire/node-crawler
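For a sense of what a crawler framework automates, here is a minimal, library-free sketch of the basic loop: fetch a page, extract its links, and queue the ones that have not been seen yet. This is not node-crawler's API. The names `extractLinks`, `crawl`, `fetchPage`, and `maxPages` are made up for illustration, and the regex-based link extraction is a simplification that a real crawler would replace with a proper HTML parser.

```javascript
// Minimal breadth-first crawler sketch (illustrative only, not node-crawler's API).
// `fetchPage` is a hypothetical injected function: (url) => Promise resolving to HTML text.

// Collect absolute http(s) URLs from the href attributes of <a> tags.
// Assumption: a simple regex is good enough for the sketch.
function extractLinks(html, baseUrl) {
  const links = [];
  const pattern = /<a\s[^>]*href\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    try {
      const url = new URL(match[1], baseUrl); // resolve relative links
      url.hash = '';                          // drop #fragments
      if (url.protocol === 'http:' || url.protocol === 'https:') {
        links.push(url.href);
      }
    } catch (err) {
      // Ignore hrefs that cannot be resolved to a URL.
    }
  }
  return links;
}

// Visit pages on the same origin as startUrl, breadth-first, up to maxPages.
// Resolves to the list of visited URLs in visit order.
async function crawl(startUrl, fetchPage, maxPages = 50) {
  const start = new URL(startUrl);
  const visited = new Set();
  const queue = [start.href];

  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);

    let html;
    try {
      html = await fetchPage(url);
    } catch (err) {
      continue; // skip pages that fail to load
    }

    for (const link of extractLinks(html, url)) {
      if (new URL(link).origin === start.origin && !visited.has(link)) {
        queue.push(link);
      }
    }
  }
  return [...visited];
}
```

Assuming a Node.js version with a global `fetch` (18 or later), it could be called as `crawl('https://example.com/', (url) => fetch(url).then((res) => res.text()))`. A framework adds what this sketch leaves out: rate limiting, concurrency, retries, and robots.txt handling.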
Try PhantomJS. It is not exactly a crawler, but it can easily be used as one. It has a fully functional WebKit engine built in, can save screenshots, and works as a simple command-line JavaScript interpreter.
There's a new framework that was just released for Node.js called spider. It uses jQuery under the hood to crawl and index a website's HTML pages. The API and configuration are really nice, especially if you already know jQuery.
From the test suite, here's an example of crawling the New York Times website: