Pickling decorated callable class wrapper

Posted 2019-04-10 19:33

Question:

I'm struggling to pickle a wrapped function when I use a custom callable class as a wrapper.

I have a callable class "Dependee" that keeps track of dependencies for a wrapped function with a member variable "depends_on". I'd like to use a decorator to wrap functions and also be able to pickle the resulting wrapped function.

So I define my Dependee class. Note the use of functools.update_wrapper.

>>> import functools
>>> import pickle
>>> class Dependee:
...     
...     def __init__(self, func, depends_on=None):
...         self.func = func
...         self.depends_on = depends_on or []
...         functools.update_wrapper(self, func)
...         
...     def __call__(self, *args, **kwargs):
...         return self.func(*args, **kwargs)
... 

Then I define my decorator such that it will return an instance of the Dependee wrapper class.

>>> class depends:
...     
...     def __init__(self, on=None):
...         self.depends_on = on or []
...     
...     def __call__(self, func):
...         return Dependee(func, self.depends_on)
... 

Here's an example of a wrapped function.

>>> @depends(on=["foo", "bar"])
... def sum(x, y): return x+y
... 

The member variable seems to be accessible.

>>> print(sum.depends_on)
['foo', 'bar']

I can call the function as expected.

>>> print(sum(1,2))
3

But I can't pickle the wrapped instance.

>>> print(pickle.dumps(sum))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <function sum at 0x7f543863fbf8>: it's not the same object as __main__.sum

What am I missing? How can I give pickle a more appropriately qualified name so that it finds the instance rather than the original function? Note that manual wrapping works just fine.

>>> def sum2_func(x,y): return x+y
... 
>>> sum2 = Dependee(sum2_func, depends_on=["foo", "bar"])
>>> print(sum2.depends_on)
['foo', 'bar']
>>> print(sum2(1,2))
3
>>> print(pickle.loads(pickle.dumps(sum2)).depends_on)
['foo', 'bar']

Answer 1:

Yep, this is a well-known pickle limitation: it can't pickle functions or classes that can't be retrieved by their name from their module. See, e.g., https://code.google.com/p/modwsgi/wiki/IssuesWithPickleModule for clear examples (specifically on how this affects modwsgi, but also of the issue in general).
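
To make the constraint concrete, here is a small sketch (the names add and original are illustrative, not from the question, and the code is assumed to run at module level). pickle stores a function as a reference, i.e. its module plus qualified name, and when dumping it checks that the name still resolves to the very same object. Rebinding the name, which is what a decorator returning a new object does, breaks that check.

```python
import pickle


def add(x, y):
    return x + y


# A function is pickled by reference: the payload holds its module and
# qualified name, not its code.
payload = pickle.dumps(add)
restored = pickle.loads(payload)
same_object = restored is add

# Keep a handle on the function, then rebind its name, as a decorator
# that returns a new object would.
original = add
add = "something else now lives under this name"

try:
    pickle.dumps(original)
    error = None
except pickle.PicklingError as exc:
    # pickle looked up the name, found a different object, and refused.
    error = exc
```

This is the same situation as in the question: the Dependee instance holds the original function in its attributes, and that function can no longer be found under its own name.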

In this case since all you're doing is adding attributes to the function, you can get away with a simplified approach:

class depends:

    def __init__(self, on=None):
        self.depends_on = on or []

    def __call__(self, func):
        func.func = func
        func.depends_on = self.depends_on or []
        return func

The return func is the key idea: return the same object that's being decorated (possibly after decorating it, as here, with additional attributes), not a different object, or else the name-vs-identity issue crops up again.

Now this will work (just your original code, only changing depends as above):

$ python d.py 
['foo', 'bar']
3
c__main__
sum
p0
.
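
For reference, a self-contained script along these lines might look like the sketch below. The function name total is my own choice (to avoid shadowing the built-in sum), and the script is assumed to run at module level. Note that the pickle holds only a reference to the function, so the depends_on attribute is not stored in the payload; it is there after loading because the lookup returns the already decorated function.

```python
import pickle


class depends:
    """Decorator that tags the function with metadata and returns it unchanged."""

    def __init__(self, on=None):
        self.depends_on = on or []

    def __call__(self, func):
        func.depends_on = list(self.depends_on)
        return func  # same object, so the module-level name still matches


@depends(on=["foo", "bar"])
def total(x, y):
    return x + y


# Round trip: the function is stored by reference and looked up again on load.
restored = pickle.loads(pickle.dumps(total))
print(restored.depends_on)
print(restored(1, 2))
print(restored is total)
```

In a different process, the attributes would be recreated when the defining module is imported and the decorator runs again.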

Of course, this isn't a general-purpose solution (it only works if it's feasible for the decorator to return the same object it's decorating), just one that works in your example.

I am not aware of any serialization approach able to serialize and de-serialize Python objects without this limitation, alas.
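
Within that limitation, if the decorator really has to return a wrapper object, one possible sketch is to have the wrapper's __reduce__ return its qualified name as a string. pickle documents that a string returned from __reduce__ is treated as the name of a global, so the wrapper itself is stored by reference under the name it was bound to. This is an assumption-laden sketch, not a general fix: it only works for wrappers bound at module level under the decorated function's own name, and total is again an illustrative name.

```python
import functools
import pickle


class Dependee:
    def __init__(self, func, depends_on=None):
        self.func = func
        self.depends_on = depends_on or []
        # Copies __module__, __name__, __qualname__, ... onto the instance.
        functools.update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __reduce__(self):
        # A string return value tells pickle to store this object as a
        # global reference under that name, not to pickle its attributes.
        return self.__qualname__


class depends:
    def __init__(self, on=None):
        self.depends_on = on or []

    def __call__(self, func):
        return Dependee(func, self.depends_on)


@depends(on=["foo", "bar"])
def total(x, y):
    return x + y


# The name "total" resolves to the Dependee instance, so the check passes.
restored = pickle.loads(pickle.dumps(total))
```

The trade-off is that a manually wrapped instance bound to a different name (as with sum2 in the question) would no longer pickle with this __reduce__, because the qualified name would resolve to the plain function instead of the wrapper.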



Answer 2:

You just need a better serializer, like dill. As for how it works, dill registers many Python types with the equivalent of copy_reg (copyreg in Python 3), treats __main__ much like any other module, and can serialize either by reference or by object. That last point matters if you want to serialize a function or class and carry its definition along in the pickle. The resulting pickle is somewhat larger than one serialized by reference, but it is more robust.
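
As a rough illustration of what "registering a type" means, here is a standard-library sketch using copyreg. It is not dill's actual code; the reducer reduce_module is a hypothetical helper that teaches pickle how to handle module objects, which it rejects by default.

```python
import copyreg
import importlib
import math
import pickle
import types


def reduce_module(module):
    # Rebuild the module on load by importing it again by name.
    return importlib.import_module, (module.__name__,)


# Without a registered reducer, pickle refuses module objects.
try:
    pickle.dumps(math)
    default_error = None
except (TypeError, pickle.PicklingError) as exc:
    default_error = exc

# Register the reducer for the module type, then round-trip a module.
copyreg.pickle(types.ModuleType, reduce_module)
restored = pickle.loads(pickle.dumps(math))
```

A serializer can extend this idea to functions, classes, and other types, choosing per type whether to store a reference or the full definition.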

Here's your code exactly:

>>> import dill
>>> import functools
>>> class Dependee:
...   def __init__(self, func, depends_on=None):
...     self.func = func
...     self.depends_on = depends_on or []
...     functools.update_wrapper(self, func)
...   def __call__(self, *args, **kwargs):
...     return self.func(*args, **kwargs)
... 
>>>       
>>> class depends:
...   def __init__(self, on=None):
...     self.depends_on = on or []
...   def __call__(self, func):
...     return Dependee(func, self.depends_on)
... 
>>> @depends(on=['foo','bar'])
... def sum(x,y): return x+y
... 
>>> print(sum.depends_on)
['foo', 'bar']
>>> print(sum(1,2))
3
>>> _sum = dill.dumps(sum)
>>> sum_ = dill.loads(_sum)
>>> print(sum_(1,2))
3
>>> print(sum_.depends_on)
['foo', 'bar']
>>> 

Get dill here: https://github.com/uqfoundation