Connection to other side was lost in a non-clean fashion

Posted 2019-07-16 06:51

from scrapy.spider import BaseSpider

class dmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
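        # Name the output file after the last path segment of the URL.
        filename = response.url.split("/")[-2]
        # Use a context manager so the file is closed after writing.
        with open(filename, 'wb') as f:
            f.write(response.body)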

When I run "scrapy crawl dmoz" with the spider above, I get this error:

2013-09-14 13:20:56+0700 [dmoz] DEBUG: Retrying <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (failed 1 times): Connection to other side was lost in a non-clean fashion.

Does anyone know how to fix this?

Tags: python scrapy

1 answer

戒情不戒烟

Reply #2 · 2019-07-16 07:27

Check your internet connection first. If you are behind a proxy, set the proxy environment variables, including the username and password if the proxy requires authentication.

On Windows, try these steps:

1. Press Win+R and type 'systempropertiesadvanced' (without the quotes)
2. Click the "Environment Variables..." button
3. Add two new variables (either user or system variables are fine):

name        | value
------------+--------------------------------  
HTTP_PROXY  | http://username:password@host:port 
HTTPS_PROXY | https://username:password@host:port

alternative way: setting-proxy-env
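
The same two variables can also be set from Python before the crawl starts. The following is only a minimal sketch: the helper names `build_proxy_url` and `apply_proxy_env`, the host `proxy.example.com`, and the credentials are all placeholders that do not come from the answer above. It percent-encodes the username and password so that characters such as '@' or ':' do not break the URL.

```python
import os
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Return scheme://username:password@host:port with encoded credentials."""
    # Characters such as '@' or ':' inside a password would otherwise be
    # mistaken for URL delimiters, so percent-encode both fields.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


def apply_proxy_env(http_url, https_url):
    """Set HTTP_PROXY and HTTPS_PROXY for this process and its children."""
    os.environ["HTTP_PROXY"] = http_url
    os.environ["HTTPS_PROXY"] = https_url


if __name__ == "__main__":
    # Placeholder values (assumptions); substitute your real proxy details.
    http_url = build_proxy_url("username", "p@ss:word", "proxy.example.com", 8080)
    https_url = build_proxy_url(
        "username", "p@ss:word", "proxy.example.com", 8080, scheme="https"
    )
    apply_proxy_env(http_url, https_url)
    print(os.environ["HTTP_PROXY"])
```

Unlike the Windows dialog, this only affects the current process and anything it launches, so it has to run before the spider makes its first request.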
