I am experiencing slow crawl speeds with Scrapy (around 1 page/sec). I'm crawling a major website from AWS servers, so I don't think it's a network issue. CPU utilization is nowhere near 100%, and if I start multiple Scrapy processes the overall crawl speed is much faster.
Scrapy seems to crawl a bunch of pages, then hangs for several seconds, and then repeats.
I've tried playing with the concurrency settings, setting CONCURRENT_REQUESTS = CONCURRENT_REQUESTS_PER_DOMAIN = 500, but this doesn't really seem to move the needle past about 20.
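For reference, here is roughly what the relevant part of my settings.py looks like; the last three lines are just the Scrapy defaults, which I don't believe I've changed:

    # settings.py (relevant excerpt; commented values are the defaults
    # as far as I know -- I haven't touched delay or AutoThrottle)
    CONCURRENT_REQUESTS = 500
    CONCURRENT_REQUESTS_PER_DOMAIN = 500
    CONCURRENT_REQUESTS_PER_IP = 0    # 0 = no per-IP limit (default)
    DOWNLOAD_DELAY = 0                # no artificial delay (default)
    AUTOTHROTTLE_ENABLED = False      # AutoThrottle off (default)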
Are you sure you are allowed to crawl the destination site at high speed? Many sites implement download thresholds and, after a while, start responding slowly to clients that exceed them.
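One quick way to check is to log the per-request download latency that Scrapy records in response.meta: if that number keeps climbing over the course of a run, the server (or a rate limiter in front of it) is slowing you down rather than anything on your side. A minimal sketch (spider name and start URL are placeholders, point it at the site you're actually crawling):

    import scrapy

    class LatencyCheckSpider(scrapy.Spider):
        # Placeholder name and URL -- substitute the site you are crawling.
        name = "latency_check"
        start_urls = ["https://example.com/"]

        def parse(self, response):
            # Scrapy stores the time spent fetching each response in
            # response.meta["download_latency"]; a steadily increasing
            # value during a crawl suggests server-side throttling.
            latency = response.meta.get("download_latency", -1.0)
            self.logger.info("%s -> status %s, download_latency %.2fs",
                             response.url, response.status, latency)

If the latencies do climb, enabling AUTOTHROTTLE_ENABLED and staying within the site's limits is usually a better fix than raising concurrency further.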