I'm building a download manager in Python for fun. Sometimes the connection to the server stays open but the server stops sending data, so the read method (of HTTPResponse) blocks forever. This happens, for example, when I download from a server located outside my country that limits bandwidth to other countries.
How can I set a timeout for the read method (2 minutes for example)?
Thanks, Nir.
5 years later but hopefully this will help someone else...
I was racking my brain trying to figure this out. My problem was a server returning corrupt content and thus sending back less data than it claimed to have.
I came up with a nasty solution that seems to be working properly. Here it goes:
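As a rough Python 3 sketch of this idea (http.client is the Python 3 name for httplib): the helpers below reach through the response's file object to the underlying socket and set a timeout on it. The names set_read_timeout and read_body are made up for illustration, and response.fp.raw._sock is a private CPython implementation detail rather than a public API, so it may differ between versions.

```python
import http.client


def set_read_timeout(response, seconds):
    """Set a timeout on the socket behind an http.client.HTTPResponse.

    Hypothetical helper. ASSUMPTION: response.fp is the buffered reader
    created by socket.makefile("rb"), so response.fp.raw is a
    socket.SocketIO whose private _sock attribute is the real socket.
    """
    sock = response.fp.raw._sock
    sock.settimeout(seconds)  # applies to each blocking recv from now on
    return sock


def read_body(host, path, port=80, read_timeout=120.0):
    """Fetch path from host and return the body as bytes.

    Raises socket.timeout if the server stops sending data for longer
    than read_timeout seconds while the body is being read.
    """
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request("GET", path)
        # Note: no timeout is active yet while waiting for the headers.
        response = conn.getresponse()
        set_read_timeout(response, read_timeout)
        return response.read()
    finally:
        conn.close()
```

Because the timeout is only set after the headers arrive, a server that never answers at all would still block in getresponse; passing a timeout to the connection itself covers that case.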
NOTE: This solution also works for ANY library that uses the normal Python sockets (which should be all of them?). You just have to go a few levels deeper:
As of this writing, I have not tried the following, but in theory it should work:
Explanation
I stumbled upon this approach when reading this SO question about setting a timeout on socket.recv.
At the end of the day, any HTTP request has a socket. For httplib, that socket is located at resp.raw._fp.fp._sock.socket. The resp.raw._fp.fp._sock is a socket._fileobj (which I honestly didn't look far into), and I imagine its settimeout method internally sets the timeout on the socket attribute.
You have to set it during HTTPConnection initialization.
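A minimal sketch of setting the timeout at initialization, written against Python 3's http.client (the renamed httplib). The function name fetch_with_timeout and its parameters are hypothetical. The timeout applies to each blocking socket operation (the connect and every individual read), not to the download as a whole, so a slow but steady transfer is not cut off.

```python
import http.client


def fetch_with_timeout(host, path, port=80, timeout=120.0, chunk_size=64 * 1024):
    """Download path from host, returning (status, body_bytes).

    Hypothetical helper. Raises socket.timeout if connecting, or any
    single read, blocks for longer than timeout seconds.
    """
    # The timeout is handed to the underlying socket when it is created.
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        chunks = []
        while True:
            # Each read waits at most `timeout` seconds for more data.
            chunk = response.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        return response.status, b"".join(chunks)
    finally:
        conn.close()
```

A caller would typically wrap the call in try/except socket.timeout and decide there whether to retry or give up.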
Note: if you are using an older version of Python (before 2.6), you can install httplib2; many consider it a superior alternative to httplib, and it does support timeouts.
I've never used it, though, and I'm just reporting what the documentation and blogs say.
Setting the default timeout might abort a large download early, as opposed to only aborting when no data arrives for the timeout period. httplib2 is probably the way to go.
If you're stuck on some Python version < 2.6, one (imperfect but usable) approach is to call socket.setdefaulttimeout(10.0) before you start using httplib. The docs are here, and clearly state that setdefaulttimeout is available since Python 2.3. Every socket made from the time you do this call, to the time you call the same function again, will use that timeout of 10 seconds. You can use getdefaulttimeout before setting a new timeout, if you want to save the previous timeout (including None) so that you can restore it later (with another setdefaulttimeout).
These functions and idioms are quite useful whenever you need to use some older higher-level library which uses Python sockets but doesn't give you a good way to set timeouts. Of course it's better to use updated higher-level libraries, e.g. the httplib version that comes with 2.6 or the third-party httplib2 in this case, but that's not always feasible, and playing with the default timeout setting can be a good workaround.
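The save-and-restore idiom described above can be sketched as a small context manager. The name default_socket_timeout is hypothetical; only socket.getdefaulttimeout and socket.setdefaulttimeout come from the standard library.

```python
import socket
from contextlib import contextmanager


@contextmanager
def default_socket_timeout(seconds):
    """Temporarily change the default timeout for newly created sockets.

    Hypothetical helper. The previous default (possibly None, meaning
    "block forever") is restored on exit, even if an exception is raised.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        socket.setdefaulttimeout(previous)
```

Any socket created inside a "with default_socket_timeout(10.0):" block, including those an older library opens internally, picks up the 10-second timeout. Since the default is process-wide, sockets created by other threads during that window are affected as well.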