I'm gathering statistics on a list of websites and I'm using requests for it for simplicity. Here is my code:
import requests

data = []
websites = ['http://google.com', 'http://bbc.co.uk']
for w in websites:
    r = requests.get(w, verify=False)
    data.append((r.url, len(r.content), r.elapsed.total_seconds(), str([(l.status_code, l.url) for l in r.history]), str(r.headers.items()), str(r.cookies.items())))
Now, I want requests.get to time out after 10 seconds so the loop doesn't get stuck.
This question has been of interest before too but none of the answers are clean. I will be putting some bounty on this to get a nice answer.
I hear that maybe not using requests is a good idea, but then how should I get the nice things requests offers (the ones in the tuple)?
Pass the timeout argument either as a tuple, timeout=(connection timeout, data read timeout), or as a single value (timeout=1) that applies to both.
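As a rough illustration of the tuple form, here is a minimal sketch. The helper name fetch and its default values are assumptions made for the example, not part of requests itself.

```python
import requests


def fetch(url, connect_timeout=3.05, read_timeout=10):
    """Hypothetical helper: return the response, or None on a timeout."""
    try:
        # Tuple form: one limit for connecting, one for each read from the server.
        return requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.Timeout:
        # Timeout is the base class of both ConnectTimeout and ReadTimeout.
        return None
```

Calling requests.get(url, timeout=10) instead would apply the same ten-second limit to both phases.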
If it comes to that, create a watchdog thread that messes up requests' internal state after 10 seconds, e.g.:
Note that, depending on the system libraries, you may be unable to set a deadline on DNS resolution.
To create a timeout you can use signals.
The best way to solve this case is probably to use a try-except-finally block. Here is some example code:
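A minimal sketch of the signal approach follows. It reuses the names TimeoutException and _timeout mentioned in this answer; the one-second limit and the three-second sleep are assumptions chosen so the demo finishes quickly, and the question's case would use 10.

```python
import signal
from time import sleep


class TimeoutException(Exception):
    """Raised by the alarm handler when the time limit expires."""


def _timeout(signum, frame):
    # Signal handlers receive the signal number and the current stack frame.
    raise TimeoutException()


# Install the handler, then ask the OS to deliver SIGALRM after the limit.
signal.signal(signal.SIGALRM, _timeout)
signal.alarm(1)  # whole seconds; the question's case would use 10

timed_out = False
try:
    sleep(3)  # stand-in for the slow call, e.g. requests.get(...)
except TimeoutException:
    timed_out = True
    print("It timed out!")
finally:
    signal.alarm(0)  # cancel any pending alarm so it cannot fire later
```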
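There are some caveats to this: it is not thread-safe, since Python only delivers signals to the main thread, and SIGALRM is only available on Unix-like systems.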
But it's all in the standard Python library! Except for the sleep function import, it's only one import. If you are going to use timeouts in many places, you can easily put the TimeoutException, _timeout and the signaling in a function and just call that. Or you can make a decorator and put it on functions; see the answer linked below.
You can also set this up as a "context manager" so you can use it with the with statement. One possible downside of this context manager approach is that you can't know whether the code actually timed out.
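The context manager variant could be sketched as below. The function name timeout is an assumption made for the example. This version swallows the TimeoutException on exit, which is exactly why the caller cannot tell whether the limit was hit.

```python
import signal
from contextlib import contextmanager


class TimeoutException(Exception):
    """Raised by the alarm handler when the time limit expires."""


@contextmanager
def timeout(seconds):
    """Hypothetical helper: abandon the with-body after `seconds` (an int)."""
    def _timeout(signum, frame):
        raise TimeoutException()

    # Remember the previous handler so it can be restored afterwards.
    old_handler = signal.signal(signal.SIGALRM, _timeout)
    signal.alarm(seconds)
    try:
        yield
    except TimeoutException:
        pass  # swallowed here, so the caller never sees the timeout
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
```

Usage would look like `with timeout(10): r = requests.get(w)`. If knowing about the timeout matters, one option is to let the exception propagate instead of passing.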
Sources and recommended reading:
Try this request with timeout & error handling:
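One way this might look, as a sketch rather than a definitive implementation: the helper name safe_get and the error strings it returns are assumptions for illustration, while the exception classes are the ones requests exposes.

```python
import requests


def safe_get(url, timeout=10):
    """Hypothetical helper: return (response, None) or (None, error text)."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into HTTPError
        return response, None
    except requests.exceptions.Timeout:
        # Checked first: ConnectTimeout is also a ConnectionError.
        return None, "timeout"
    except requests.exceptions.ConnectionError:
        return None, "connection error"
    except requests.exceptions.HTTPError as err:
        return None, "http error %s" % err.response.status_code
    except requests.exceptions.RequestException as err:
        # Base class for everything else requests can raise.
        return None, "request failed: %s" % err
```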
What about using eventlet? If you want to time out the request after 10 seconds, even if data is being received, this snippet will work for you:
Set the timeout parameter:
As long as you don't set stream=True on that request, this will cause the call to requests.get() to time out if the connection takes more than ten seconds, or if the server doesn't send data for more than ten seconds.
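Applied to the loop from the question, this might look like the sketch below. The function name collect_stats and the choice to skip sites that time out are assumptions made for the example.

```python
import requests


def collect_stats(websites, timeout=10):
    """Hypothetical helper: gather the question's tuple for each site."""
    data = []
    for w in websites:
        try:
            r = requests.get(w, verify=False, timeout=timeout)
        except requests.exceptions.Timeout:
            continue  # skip this site so the loop does not get stuck
        data.append((
            r.url,
            len(r.content),
            r.elapsed.total_seconds(),
            str([(l.status_code, l.url) for l in r.history]),
            str(r.headers.items()),
            str(r.cookies.items()),
        ))
    return data
```

Keep in mind that the limit applies to each wait on the socket rather than to the whole download, so a slow but steady server can still take longer than ten seconds overall.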