A timeout decorator class with multiprocessing giv

Posted 2019-07-11 02:04

On Windows, the signal-based and thread-based approaches to timing out a function are generally bad ideas or simply don't work.

I've written the following timeout code, which raises a timeout exception from multiprocessing when the code takes too long. This is exactly what I want.

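from multiprocessing import Pool

def timeout(timeout, func, *arg):
    # Run func in a single worker process; get() raises
    # multiprocessing.TimeoutError if no result arrives in time.
    with Pool(processes=1) as pool:
        result = pool.apply_async(func, arg)
        return result.get(timeout=timeout)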

I'm now trying to get this into a decorator style so that I can add it to a wide range of functions, especially those where external services are called and I have no control over the code or duration. My current attempt is below:

class TimeWrapper(object):

    def __init__(self, timeout=10):
        """Timing decorator"""
        self.timeout = timeout

    def __call__(self, f):
        def wrapped_f(*args):
            with Pool(processes=1) as pool:
                result = pool.apply_async(f, (*args,))
                return result.get(timeout=self.timeout)

        return wrapped_f

It gives a pickling error:

@TimeWrapper(7)
def func2(x, y):
    time.sleep(5)
    return x*y

File "C:\Users\rmenk\AppData\Local\Continuum\anaconda3\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function func2 at 0x000000770C8E4730>: it's not the same object as __main__.func2

I suspect this is due to multiprocessing and the decorator not playing nicely together, but I don't know how to make them cooperate. Any ideas on how to fix this?

PS: I've done extensive research on this site and elsewhere but haven't found any answers that work, be it with pebble, threading, as a function decorator or otherwise. If you have a solution that you know works on Windows and Python 3.5, I'd be very happy to just use that.

2 answers
在下西门庆
#2 · 2019-07-11 02:29

The only problem You have here is that You tested the decorated function in the main context. Move it out to a different module and it will probably work.

I wrote wrapt_timeout_decorator, which uses wrapt, dill, multiprocess and pipes instead of pickle, multiprocessing and queues, because that combination can serialize more data types.

It might look simple at first, but a reliable timeout decorator on Windows is quite tricky. You might use mine; it is quite mature and tested:

https://github.com/bitranox/wrapt_timeout_decorator

On Windows the main module is imported again (but with a name other than '__main__') because Python is trying to simulate fork-like behavior on a system that doesn't support forking. multiprocessing tries to create an environment similar to your main process by importing the main module again under a different name. That's why you need to shield the entry point of your program with the famous "if __name__ == '__main__':" guard.

import lib_foo

def some_module():
    lib_foo.function_foo()

def main():
    some_module()

# here the subprocess stops loading, because __name__ is NOT '__main__'
if __name__ == '__main__':
    main()

This is a problem of Windows OS, because the Windows Operating System does not support "fork"

You can find more information on that here:

Workaround for using __name__=='__main__' in Python multiprocessing

https://docs.python.org/2/library/multiprocessing.html#windows

Since main.py is loaded again under a name other than "__main__", the decorated function now points to objects that no longer exist; therefore you need to put the decorated classes and functions into another module. In general (especially on Windows), the main program should contain nothing but the main function; the real work should happen in the modules. I also usually put all settings or configuration in a separate file, so that all processes or threads can access them (and to keep them together in one place, not to mention type hints and name completion in your favorite editor).

The "dill" serializer is also able to serialize the __main__ context, which means the objects in our example are pickled as "__main__.lib_foo", "__main__.some_module", "__main__.main" and so on. We would not have this limitation when using "pickle", with the downside that "pickle" cannot serialize the following types:

functions with yields, nested functions, lambdas, cell, method, unboundmethod, module, code, methodwrapper, dictproxy, methoddescriptor, getsetdescriptor, memberdescriptor, wrapperdescriptor, xrange, slice, notimplemented, ellipsis, quit

In addition, dill supports:

save and load python interpreter sessions, save and extract the source code from functions and classes, interactively diagnose pickling errors

To support more types with the decorator, we selected dill as serializer, with the small downside that methods and classes can not be decorated in the main context, but need to reside in a module.
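The pickle limitation mentioned above is easy to check with the standard library alone. The following is a minimal sketch, not taken from wrapt_timeout_decorator; the helper names (make_adder, count_up, pickle_failure) are made up for illustration. dill is deliberately not imported, so the snippet runs without third-party packages.

```python
import pickle


def make_adder(n):
    """Return a nested function (a closure), one of the types listed above."""
    def add(x):
        return x + n
    return add


def count_up(limit):
    """A function with yield; calling it produces a generator object."""
    for i in range(limit):
        yield i


def pickle_failure(obj):
    """Return the exception class name if pickle rejects obj, else None."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        # The exact exception class differs between Python versions.
        return type(exc).__name__
    return None


failures = {
    "lambda": pickle_failure(lambda x: x * 2),
    "nested function": pickle_failure(make_adder(1)),
    "generator": pickle_failure(count_up(3)),
}

for label, error in failures.items():
    print(f"{label}: {error}")
```

All three objects should be rejected by pickle. Replacing pickle.dumps with dill.dumps is expected to let at least the lambda and the nested function through, which is the trade-off described above.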

You can find more information on that here: Serializing an object in __main__ with pickle or dill

放荡不羁爱自由
#3 · 2019-07-11 02:37

What you are trying to achieve is particularly cumbersome in Windows. The core issue is that when you decorate a function, you shadow it. This happens to work just fine in UNIX due to the fact it uses the fork strategy to create a new process.

In Windows though, the new process will be a blank one where a brand new Python interpreter is started and loads your module. When the module gets loaded, the decorator hides the real function making it hard to find for the pickle protocol.
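The shadowing can be reproduced without starting any process, because pickle stores a function as a reference (module plus qualified name) and checks that the name still leads back to the same object. Here is a small sketch under that assumption; passthrough and try_pickle are hypothetical helpers, with passthrough standing in for the TimeWrapper class from the question.

```python
import pickle


def passthrough(func):
    """Hypothetical decorator that replaces func with a wrapper."""
    def wrapper(*args):
        return func(*args)
    wrapper.original = func  # keep a handle on the undecorated function
    return wrapper


@passthrough
def func2(x, y):
    return x * y


def try_pickle(obj):
    """Return None on success, otherwise the name of the exception raised."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as exc:
        return type(exc).__name__
    return None


# The module-level name func2 now refers to the wrapper, so looking the
# original function up by name finds a different object.
original_error = try_pickle(func2.original)
# The wrapper itself is a nested function, which pickle cannot locate either.
wrapper_error = try_pickle(func2)

print("original:", original_error)
print("wrapper:", wrapper_error)
```

Neither the undecorated function nor the wrapper can be sent to a worker as is, which is why the call has to go through something that pickle can still find by name.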

The only way to get it right is to rely on a trampoline function that is set up during decoration. You can take a look at how it is done in pebble but, unless you're doing this as an exercise, I'd recommend using pebble directly, as it already offers what you are looking for.

from pebble import concurrent

@concurrent.process(timeout=60)
def my_function(var, keyvar=0):
    return var + keyvar

future = my_function(1, keyvar=2)
future.result()