Anyone know of a good Python based web crawler tha - Answers, page 2


Posted 2019-01-03 11:03

I'm half-tempted to write my own, but I don't really have enough time right now. I've seen the Wikipedia list of open source crawlers but I'd prefer something written in Python. I realize that I could probably just use one of the tools on the Wikipedia page and wrap it in Python. I might end up doing that - if anyone has any advice about any of those tools, I'm open to hearing about them. I've used Heritrix via its web interface and I found it to be quite cumbersome. I definitely won't be using a browser API for my upcoming project.

Thanks in advance. Also, this is my first SO question!

Tags: python web-crawler

8 answers

何必那么认真

Reply 2 · 2019-01-03 11:49

I've used Ruya and found it pretty good.


够拽才男人

Reply 3 · 2019-01-03 12:06

Another simple spider: it uses BeautifulSoup and urllib2. Nothing too sophisticated; it just reads all the a href's, builds a list, and goes through it.
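For anyone who wants to see the shape of that approach, here is a rough sketch of the "read the hrefs, build a list, go through it" loop. It is not the spider linked above. To keep it dependency-free it swaps BeautifulSoup and urllib2 for their Python 3 standard-library counterparts (`html.parser` and `urllib.request`), and the names `LinkParser`, `extract_links`, `fetch`, `crawl`, and `max_pages` are all made up for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute URLs for all <a href> values, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def fetch(url):
    """Download a page and decode it leniently (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_pages=10, fetch_page=fetch):
    """Breadth-first crawl: read the links on each page, queue them, repeat."""
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        visited.append(url)
        for link in extract_links(html, url):
            # Only follow web links, and never queue the same URL twice.
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Calling `crawl("https://example.com/", max_pages=5)` would return the list of pages it managed to download, in visit order. A real crawler would also want to respect robots.txt, rate-limit its requests, and probably stay on one domain; none of that is handled here.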
