可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I've just inherited some Python code and need to fix a bug as soon as possible. I have very little Python knowledge so please excuse my ignorance. I am using urllib2 to extract data from web pages. Despite using socket.setdefaulttimeout(30) I am still coming across URLs that hang seemingly indefinitely.

I want to time out the extraction and have got this far after much searching the web:

import socket 
socket.setdefaulttimeout(30)

reqdata = urllib2.Request(urltocollect)

    def handler(reqdata):
        ????  reqdata.close() ????


    t = Timer(5.0, handler,[reqdata])
    t.start()
    urldata = urllib2.urlopen(reqdata)
    t.cancel()

The handler function triggers after the time has passed but I don't know how to get it to stop the openurl operation.

Any guidance would be gratefully received. C

UPDATE ------------------------- In my experience when used on certain URLs urllib2.urlopen hangs and waits indefinitely. The URLs that do this are ones that when pointed to with a browser never resolve, the browser just waits with the activity indicator moving but never connecting fully. I suspect that these URLs may be stuck inside some kind of infinite looping redirect. The timeout argument to urlopen (in later versions of Python) and the socket.setdefaulttimeout() global setting do not detect this issue on my system.

I tried a number of solutions but in the end I updraded to Python 2.7 and used a variation of Werner’s answer below. Thanks Werner.

回答1:

You can achieve this using signals.

Here's an example of my signal decorator that you can use to set the timeout for individual functions.

Ps. not sure if this is syntactically correct for 2.4. I'm using 2.6 but the 2.4 supports signals.

import signal
import time

class TimeOutException(Exception):
    pass

def timeout(seconds, *args, **kwargs):
    def fn(f):
        def wrapped_fn(*args, **kwargs):
            signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)
            f(*args, **kwargs)
        return wrapped_fn
    return fn

def handler(signum, frame):
    raise TimeOutException("Timeout")

@timeout(5)
def my_function_that_takes_long(time_to_sleep):
    time.sleep(time_to_sleep)

if __name__ == '__main__':
    print 'Calling function that takes 2 seconds'
    try:
        my_function_that_takes_long(2)
    except TimeOutException:
        print 'Timed out'

    print 'Calling function that takes 10 seconds'
    try:
        my_function_that_takes_long(10)
    except TimeOutException:
        print 'Timed out'

回答2:

It's right there in the function.

urllib2.urlopen(url[, data][, timeout])

e.g:

urllib2.urlopen("www.google.com", data, 5)

Timing out urllib2 urlopen operation in Python 2.4

问题:

回答1:

回答2:

收藏的人(0)

Timing out urllib2 urlopen operation in Python 2.4

问题:

回答1:

回答2:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮