twisted.internet.error.ConnectError when running Scrapy

Posted 2019-07-20 07:16

I'm using scrapy to run a spider and get the following errors:

DEBUG: Retrying http://xixichengyuanlc.fang.com/esf/> (failed 2 times): An error occurred while connecting: [Failure instance: Traceback (failure with no frames): : Connection to the other side was lost in a non-clean fashion: Connection lost.

I have run this spider successfully several times before, but I wanted to use multiple user agents to make it run faster, and that is when I started getting the errors above. At first I thought something was wrong with my user agents, so I checked them but couldn't find the problem. I then went back to the earlier version of the spider, but I still get the same errors.

Below is my settings.py:

    # Scrapy settings for soufang project

    SPIDER_MODULES = ['soufang.spiders']
    NEWSPIDER_MODULE = 'soufang.spiders'
    DEFAULT_ITEM_CLASS = 'soufang.items.Community_info'


    ITEM_PIPELINES = ['soufang.pipelines.MySQLStorePipeline']
    #DOWNLOADER_MIDDLEWARES={
    #'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    #'soufang.misc.middlewares.CustomUserAgentMiddleware':400}

1 answer
干净又极端
#2 · 2019-07-20 08:18

The ITEM_PIPELINES setting is not a list, but a dict:

    ITEM_PIPELINES = {
        'soufang.pipelines.MySQLStorePipeline': 100
    }

Other than that, I can't say exactly what's wrong. I don't see USER_AGENT set anywhere in your settings; have you set it somewhere else? Also, please post the full log.
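For reference, here is a minimal sketch of what a user-agent rotating downloader middleware might look like. The class name matches the one in your commented-out DOWNLOADER_MIDDLEWARES entry, but I haven't seen your actual code, so the body and the USER_AGENTS list (with placeholder strings) are my assumptions, not your implementation. I also can't say this will fix the connection errors; it only shows the usual shape of such a middleware.

```python
import random

# Hypothetical pool of user-agent strings (placeholders, not real browsers);
# replace them with the ones you actually want to send.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/2.0",
]


class CustomUserAgentMiddleware:
    """Downloader middleware that sets a random User-Agent on each request."""

    def __init__(self, user_agents=USER_AGENTS):
        # Copy the pool so later changes to the caller's list don't affect us.
        self.user_agents = list(user_agents)

    def process_request(self, request, spider):
        # Scrapy calls this hook for every outgoing request.
        request.headers["User-Agent"] = random.choice(self.user_agents)
        # Returning None tells Scrapy to keep processing the request normally.
        return None
```

To enable it you would uncomment the DOWNLOADER_MIDDLEWARES dict in your settings.py. If you only need a single fixed user agent, setting USER_AGENT in settings.py is simpler and needs no middleware at all.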
