Timeout error when downloading .html files from URLs

Posted 2019-10-23 04:11

I get the following error when downloading HTML pages from URLs.

Error: raise URLError(err) urllib2.URLError: <urlopen error [Errno
 10060] A connection attempt failed because the connected party did not
 properly respond after a period of time or established connection
 failed because connected host has failed to respond>

Code:

import urllib2 
hdr = {'User-Agent': 'Mozilla/5.0'}

for i,site in enumerate(urls[index]):
    print (site)
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(req)
    page_content = page.read()
    with open(path_current+'/'+str(i)+'.html', 'w') as fid:
        fid.write(page_content)

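I think this might be caused by some proxy setting, or the timeout may need changing, but I'm not sure. Please help. When I check the URLs by hand, they open perfectly fine.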

Answer 1:

Well, since it doesn't happen to you most of the time, I'd guess your network is just slow. Try setting a timeout like this:

req = urllib2.Request(site, headers=hdr)
timeout_in_sec = 360
page = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(req, timeout=timeout_in_sec)
page_content = page.read()
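
A longer timeout alone may not be enough if the connection drops now and then, so retrying a few times can help too. Here is a rough sketch of that idea, not a definitive fix. It assumes Python 3, where urllib2 was split into urllib.request and urllib.error. The helper name fetch_with_retry and its retries, backoff, and sleep parameters are made up for illustration.

```python
import socket
import time
import urllib.error
import urllib.request
from http.cookiejar import CookieJar

HEADERS = {'User-Agent': 'Mozilla/5.0'}


def fetch_with_retry(url, opener=None, timeout=30, retries=3,
                     backoff=2.0, sleep=time.sleep):
    """Return the response body as bytes, retrying on URL errors and timeouts.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if opener is None:
        # Same cookie-aware opener as in the question, Python 3 spelling.
        opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(CookieJar()))
    req = urllib.request.Request(url, headers=HEADERS)
    for attempt in range(1, retries + 1):
        try:
            page = opener.open(req, timeout=timeout)
            return page.read()
        except (urllib.error.URLError, socket.timeout) as err:
            # Note: HTTPError (e.g. 404) subclasses URLError, so this
            # sketch retries those as well.
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            print('attempt %d failed (%s), retrying' % (attempt, err))
            sleep(backoff * attempt)  # wait a little longer each time
```

In the loop from the question you would call page_content = fetch_with_retry(site) and open the output file with 'wb', since read() returns bytes in Python 3.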


Source: Timeout error when downloading .html files from urls