Is it possible to write a web crawler in JavaScript? (answers, page 2)

I want to crawl a page, check it for hyperlinks, follow those hyperlinks, and capture data from the linked pages.

Tags: javascript web-crawler

10 answers

老娘就宠你

#2 · 2019-03-10 16:55

This is what you need: http://zugravu.com/products/web-crawler-spider-scraping-javascript-regular-expression-nodejs-mongodb It uses Node.js and MongoDB, with ExtJS as the GUI.

Melony?

#3 · 2019-03-10 16:56

We can crawl pages with JavaScript on the server side with the help of a headless WebKit browser. There are a few libraries for this, such as PhantomJS and CasperJS, and there is also a newer wrapper around PhantomJS called Nightmare JS that makes the work easier.

傲

#4 · 2019-03-10 17:02

Yes, it is possible.

Use Node.js (it's server-side JS)
Node.js comes with npm (a package manager that handles third-party modules)
Use PhantomJS from Node.js (PhantomJS is a third-party module that can crawl through websites)
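
To make the Node.js route concrete, here is a minimal sketch of the crawl loop the question describes: fetch a page, pull out its hyperlinks, follow the ones on the same site, and record something from each page (the title, in this case). It deliberately swaps PhantomJS for the `fetch` function built into Node.js 18 and later, so it only sees static HTML and will not run page scripts. The names `extractLinks`, `crawl`, and `defaultFetchHtml` are made up for this example, and the regex-based link extraction is an assumption that holds only for simple markup.

```javascript
// Minimal breadth-first crawler sketch for Node.js 18+ (global fetch assumed).
// extractLinks, crawl and defaultFetchHtml are hypothetical helper names.

// Pull href values out of anchor tags and resolve them against the page URL.
// A regex is only good enough for a sketch; a real crawler should use an HTML parser.
function extractLinks(html, baseUrl) {
  const links = new Set();
  const hrefPattern = /<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    try {
      const url = new URL(match[1], baseUrl);
      url.hash = ''; // Treat page#a and page#b as the same page.
      if (url.protocol === 'http:' || url.protocol === 'https:') {
        links.add(url.href);
      }
    } catch (err) {
      // Ignore hrefs that cannot be parsed as URLs.
    }
  }
  return [...links];
}

// Default page loader: plain HTTP GET, no JavaScript execution.
async function defaultFetchHtml(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
  return response.text();
}

// Visit pages breadth-first, staying on the start URL's origin.
// fetchHtml is injectable so the crawl logic can be exercised without a network.
async function crawl(startUrl, { maxPages = 20, fetchHtml = defaultFetchHtml } = {}) {
  const start = new URL(startUrl);
  const queue = [start.href];
  const seen = new Set(queue);
  const pages = [];

  while (queue.length > 0 && pages.length < maxPages) {
    const url = queue.shift();
    let html;
    try {
      html = await fetchHtml(url);
    } catch (err) {
      continue; // Skip pages that fail to load.
    }

    // "Capture data from the page": here, just the <title> text.
    const titleMatch = /<title[^>]*>([^<]*)<\/title>/i.exec(html);
    pages.push({ url, title: titleMatch ? titleMatch[1].trim() : null });

    for (const link of extractLinks(html, url)) {
      if (!seen.has(link) && new URL(link).origin === start.origin) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return pages;
}
```

Something like `crawl('https://example.com/', { maxPages: 5 }).then(console.log)` would run it against a real site. The `seen` set is what keeps the crawler from looping when pages link back to each other, and `maxPages` keeps a sketch like this from wandering indefinitely.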

何必那么认真

#5 · 2019-03-10 17:03

There are ways to circumvent the same-origin policy with JS. I wrote a crawler for Facebook that gathered information from the profiles of my friends and my friends' friends and allowed filtering the results by gender, current location, age, marital status (you catch my drift). It was simple. I just ran it from the console. That way your script gets the privilege to make requests on the current domain. You can also make a bookmarklet to run the script from your bookmarks.
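
As a rough illustration of the console approach (not the script described above, which is not shown in this answer), the sketch below collects the links on the current page that stay on the same origin. `collectSameOriginLinks` is a made-up helper name; it takes the document and origin as parameters so it can be tried outside a browser as well.

```javascript
// Sketch for the browser's developer console.
// collectSameOriginLinks is a hypothetical helper name, not a browser API.
function collectSameOriginLinks(doc, origin) {
  const urls = new Set();
  for (const anchor of doc.querySelectorAll('a[href]')) {
    try {
      // anchor.href is already resolved to an absolute URL by the browser.
      const url = new URL(anchor.href);
      if (url.origin === origin) {
        url.hash = ''; // Ignore in-page fragments.
        urls.add(url.href);
      }
    } catch (err) {
      // Skip hrefs the URL parser rejects.
    }
  }
  return [...urls];
}

// In the console, same-origin requests are not blocked by the same-origin
// policy, so each collected link can be fetched and inspected:
//
//   const links = collectSameOriginLinks(document, location.origin);
//   for (const link of links) {
//     const html = await (await fetch(link)).text();
//     console.log(link, html.length);
//   }
```

The same function body could be wrapped in a `javascript:` URL to turn it into a bookmarklet.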

Another way is to provide a PHP proxy. Your script accesses the proxy on the current domain, and the proxy requests files from the other domain with PHP. Just be careful with those. If you are not careful, a proxy like this might get hijacked and used as a public proxy by third parties.
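
One way to reduce the hijacking risk is to let the proxy fetch only from hosts you list explicitly. The sketch below shows that idea in JavaScript for Node.js 18+ rather than PHP, to stay in the language of the question. The host names, the `?url=` query parameter, and the function names `parseAllowedTarget` and `handleProxyRequest` are all assumptions made for this example.

```javascript
// Allowlisting proxy sketch for Node.js 18+ (global fetch assumed).
// Hypothetical allowlist: only these hosts may be fetched through the proxy.
const ALLOWED_HOSTS = new Set(['example.com', 'www.example.com']);

// Return a parsed URL if the target is http(s) and on the allowlist, else null.
function parseAllowedTarget(rawUrl, allowedHosts = ALLOWED_HOSTS) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch (err) {
    return null; // Not an absolute URL.
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;
  if (!allowedHosts.has(url.hostname)) return null;
  return url;
}

// Request handler for Node's http server: GET /?url=<encoded target>.
async function handleProxyRequest(req, res) {
  const requested = new URL(req.url, 'http://localhost').searchParams.get('url');
  const target = requested ? parseAllowedTarget(requested) : null;
  if (!target) {
    res.writeHead(403, { 'Content-Type': 'text/plain' });
    res.end('Target not allowed');
    return;
  }
  try {
    // Reject redirects so an allowed host cannot bounce us to another one.
    const upstream = await fetch(target, { redirect: 'error' });
    const body = await upstream.text();
    res.writeHead(upstream.status, { 'Content-Type': 'text/plain; charset=utf-8' });
    res.end(body);
  } catch (err) {
    res.writeHead(502, { 'Content-Type': 'text/plain' });
    res.end('Upstream request failed');
  }
}

// Wiring it up (not executed here):
//   require('node:http').createServer(handleProxyRequest).listen(8080);
```

Comparing the exact hostname, rather than checking whether the URL merely contains an allowed name, is what stops a target like `example.com.evil.test` from slipping through.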

Good luck, maybe you'll make a friend or two in the process like I did :-)
