Python urllib2: Cannot assign requested address

Posted 2020-03-31 08:26

I am sending thousands of requests using urllib2 with proxies. During execution I receive many of the following errors:

urlopen error [Errno 99] Cannot assign requested address

I read here that it may be due to a socket already being bound. Is that the case? Any suggestions on how to fix this?

3 Answers
不美不萌又怎样
Answer 2 · 2020-03-31 08:52

As mhawke suggested, the issue is most likely TIME_WAIT. A system-wide fix for your situation is to adjust kernel parameters so that such connections are cleaned up more often. Two options:

$ sysctl net.ipv4.tcp_tw_recycle=1

This lets the kernel reuse connections in the TIME_WAIT state. It may cause issues with NAT setups (and note that the tcp_tw_recycle knob was removed entirely in Linux 4.12). Another option is:

$ sysctl net.ipv4.tcp_max_orphans=8192
$ sysctl net.ipv4.tcp_orphan_retries=1

This tells the kernel to keep at most 8192 connections that are not attached to any user process, and to retry only once before killing such a TCP connection.

Note that these changes are not permanent. Add the settings to /etc/sysctl.conf to make them permanent.
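For example, a persistent version of the settings above might look like this in /etc/sysctl.conf (apply with `sysctl -p`; the values simply mirror the commands above):

```
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_orphans = 8192
net.ipv4.tcp_orphan_retries = 1
```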

http://code.google.com/p/lusca-cache/issues/detail?id=89#c4
http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.kernel.obscure.html

叼着烟拽天下
Answer 3 · 2020-03-31 08:58

Here is an answer to a similar looking question that I prepared earlier.... much earlier... Socket in use error when reusing sockets

The error is different, but the underlying problem is probably the same: you are consuming all available ports and trying to reuse them before the TIME_WAIT state has ended.

[EDIT: in response to comments]

If it is within the capability/spec for your application, one obvious strategy is to control the rate of connections to avoid this situation.
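As a sketch of what rate-limiting could look like (the class name and the rate value are illustrative assumptions, not from the original answer), you can space requests out so the kernel has time to recycle TIME_WAIT sockets:

```python
import time

class Throttle:
    """Block callers so calls are spaced at least `interval` seconds apart."""
    def __init__(self, per_second):
        self.interval = 1.0 / per_second
        self.last = 0.0  # monotonic timestamp of the previous call

    def wait(self):
        # Sleep just long enough to honour the configured rate.
        delay = self.last + self.interval - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        self.last = time.monotonic()

throttle = Throttle(per_second=100)
# Call throttle.wait() before each urlopen() to cap the connection rate.
```

Keeping the rate at or below the size of the ephemeral port range divided by the TIME_WAIT duration (typically 60 seconds) should stop the port pool from draining.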

Alternatively, you could use the httplib module (renamed http.client in Python 3). httplib.HTTPConnection() lets you pass a source_address tuple specifying the local address and port from which to make the connection. For example, this connects to localhost:1234 from localhost:9999:

import httplib  # Python 2; renamed http.client in Python 3

# Connect to localhost:1234, binding the local end of the socket to localhost:9999
conn = httplib.HTTPConnection('localhost:1234', source_address=('localhost', 9999))
conn.request('GET', '/index.html')
response = conn.getresponse()

Then it is a matter of managing the source port assignment as described in my earlier answer. If you are on Windows, you can use this method to get around the default ephemeral port range of 1024-5000.

There is, of course, an upper limit to how many connections you are going to be able to make, and it is questionable what sort of application would need to make thousands of connections in rapid succession.
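On Python 3, where httplib became http.client, source_address is still accepted. One way to manage the source port assignment is to cycle through a fixed pool of local ports (the pool range here is an arbitrary assumption for illustration):

```python
import itertools
import http.client  # Python 3 name for httplib

# An illustrative pool of local ports to bind outgoing connections to.
PORT_POOL = range(20000, 20100)
_ports = itertools.cycle(PORT_POOL)

def make_connection(host, port):
    """Return an HTTPConnection bound to the next local port in the pool."""
    return http.client.HTTPConnection(
        host, port, source_address=('0.0.0.0', next(_ports)))
```

Nothing is sent until request() is called, so constructing the connection is cheap; if a pooled port is still in TIME_WAIT, the connect will fail with EADDRINUSE and you can simply advance to the next port.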

放我归山
Answer 4 · 2020-03-31 09:03

I had a similar issue, but I was issuing POST requests with Python's requests library! To make it worse, I used multiprocessing over each executor to post to a server, so thousands of connections were created in seconds, and each took a few seconds to leave the TIME_WAIT state and release its port for the next set of connections.

Out of all the solutions available on the internet that suggest disabling keep-alive, using requests.Session(), and so on, I found that setting 'Connection': 'close' as a header parameter worked. You may need to put the header content on a separate line outside the post call, though.

import requests

headers = {
    'Connection': 'close'
}
with requests.Session() as session:
    response = session.post('https://xx.xxx.xxx.x/xxxxxx/x', headers=headers, files=files, verify=False)
    results = response.json()
    print(results)

Just give it a try with the requests library.
