I am using the Scrapy framework to make spiders crawl some webpages. Basically, what I want is to scrape web pages and save them to a database. I have one spider per webpage. But I am having trouble running those spiders so that one spider starts crawling exactly after another spider finishes. How can that be achieved? Is scrapyd the solution?
scrapyd is indeed a good way to go. Its `max_proc` or `max_proc_per_cpu` configuration can be used to restrict the number of spiders running in parallel; with `max_proc = 1`, scrapyd runs queued jobs one at a time, so each spider starts only after the previous one finishes. You then schedule the spiders through scrapyd's REST API.
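Here is a minimal sketch. The `schedule.json` endpoint, default port 6800, and the `max_proc` option are part of scrapyd; the project name `myproject` and the spider names are placeholders for your own deployment. First, restrict scrapyd to one running spider in `scrapyd.conf`:

```
[scrapyd]
max_proc = 1
```

Then queue the spiders in the order they should run; with `max_proc = 1`, each job waits until the previous one finishes:

```python
import requests

SCRAPYD = "http://localhost:6800/schedule.json"  # default scrapyd endpoint

# One spider per webpage, queued in the order they should crawl.
for spider in ["spider_site_one", "spider_site_two", "spider_site_three"]:
    resp = requests.post(SCRAPYD, data={
        "project": "myproject",  # the project you deployed with scrapyd-deploy
        "spider": spider,
    })
    print(resp.json())  # e.g. {"status": "ok", "jobid": "..."}
```

Each call returns immediately with a job id; scrapyd itself takes care of dequeuing and running the jobs sequentially.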