I'm struggling to pickle a wrapped function when I use a custom callable class as a wrapper.
I have a callable class "Dependee" that keeps track of dependencies for a wrapped function with a member variable "depends_on". I'd like to use a decorator to wrap functions and also be able to pickle the resulting wrapped function.
So I define my dependee class. Note the use of functools.update_wrapper.
>>> import functools
>>> import pickle
>>> class Dependee:
...
...     def __init__(self, func, depends_on=None):
...         self.func = func
...         self.depends_on = depends_on or []
...         functools.update_wrapper(self, func)
...
...     def __call__(self, *args, **kwargs):
...         return self.func(*args, **kwargs)
...
Then I define my decorator such that it will return an instance of the Dependee wrapper class.
>>> class depends:
...
...     def __init__(self, on=None):
...         self.depends_on = on or []
...
...     def __call__(self, func):
...         return Dependee(func, self.depends_on)
...
Here's an example of a wrapped function.
>>> @depends(on=["foo", "bar"])
... def sum(x, y): return x+y
...
The member variable seems to be accessible.
>>> print(sum.depends_on)
['foo', 'bar']
I can call the function as expected.
>>> print(sum(1,2))
3
But I can't pickle the wrapped instance.
>>> print(pickle.dumps(sum))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <function sum at 0x7f543863fbf8>: it's not the same object as __main__.sum
What am I missing? How can I give pickle a more appropriately qualified name so that it finds the Dependee instance rather than the original function? Note that manual wrapping works just fine.
>>> def sum2_func(x,y): return x+y
...
>>> sum2 = Dependee(sum2_func, depends_on=["foo", "bar"])
>>> print(sum2.depends_on)
['foo', 'bar']
>>> print(sum2(1,2))
3
>>> print(pickle.loads(pickle.dumps(sum2)).depends_on)
['foo', 'bar']
Yep, well-known pickle problem -- can't pickle functions or classes that can't just be retrieved by their name in the module. See e.g. https://code.google.com/p/modwsgi/wiki/IssuesWithPickleModule for clear examples (specifically on how this affects modwsgi, but also of the issue in general).
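To make the by-name mechanism concrete, here is a minimal sketch, assuming Python 3. It uses json.dumps only as a convenient importable function, and the namespace dict is a hypothetical stand-in for the module's globals; it illustrates the lookup, it is not how pickle is implemented internally.

```python
import functools
import json
import pickle

# A function is pickled as a reference: the payload holds the module name
# and the attribute name, not the function's code.
payload = pickle.dumps(json.dumps, protocol=0)
print(payload)
restored = pickle.loads(payload)
print(restored is json.dumps)  # loading imports the module and looks the name up


class Dependee:
    def __init__(self, func, depends_on=None):
        self.func = func
        self.depends_on = depends_on or []
        functools.update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def add(x, y):
    return x + y


original = add
# Mimic what the decorator syntax does: rebind the name to the wrapper.
# `namespace` is a hypothetical stand-in for the module's globals.
namespace = {"add": Dependee(original, ["foo", "bar"])}

# The wrapper still holds the original function, so pickling the wrapper
# means pickling that function by name. The name now resolves to the wrapper.
inner = namespace["add"].func
found = namespace[inner.__name__]
print(found is inner)  # False: the "not the same object" situation
```

With manual wrapping (sum2_func and sum2 in the question) the inner function keeps its own module-level name, so the lookup still finds it and pickling succeeds.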
In this case since all you're doing is adding attributes to the function, you can get away with a simplified approach:
class depends:
    def __init__(self, on=None):
        self.depends_on = on or []
    def __call__(self, func):
        func.func = func
        func.depends_on = self.depends_on or []
        return func
The return func is the key idea -- return the same object that's being decorated (possibly after decorating it, as here, with additional attributes -- but not a different object, else the name-vs-identity issue crops up).
Now this will work (just your original code, only changing depends as above):
$ python d.py
['foo', 'bar']
3
c__main__
sum
p0
.
Of course, this isn't a general-purpose solution (it only works if it's feasible for the decorator to return the same object it's decorating), just one that works in your example.
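A small sketch of why returning the same object sidesteps the problem. The decorator name tag is hypothetical, used here only to keep the example short; the point is the identity check, which is what pickle's by-name lookup depends on.

```python
def tag(*names):
    """Hypothetical decorator: attach metadata and return the same function."""
    def decorate(func):
        func.depends_on = list(names)
        return func  # same object, so the module-level name still points at it
    return decorate


def add(x, y):
    return x + y


original = add
decorated = tag("foo", "bar")(add)

print(decorated is original)  # True
print(decorated.depends_on)
print(decorated(1, 2))
```

One consequence worth keeping in mind: the pickle stores only the reference, so depends_on is not in the payload. It reappears on load because importing the module re-runs the decorator.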
I am not aware of any serialization approach able to serialize and de-serialize Python objects without this limitation, alas.
You just need a better serializer, like dill. As for how it works, dill registers a lot of Python types with the equivalent of copy_reg (copyreg in Python 3) -- it also treats __main__ similarly to a module, and lastly it can serialize by reference or by object. That last bit is relevant if you want to serialize a function or class and take the class/function definition along with the pickle. The resulting pickle is a little bigger than one serialized by reference, but it's more robust.
Here's your code exactly:
>>> import dill
>>> import functools
>>> class Dependee:
...     def __init__(self, func, depends_on=None):
...         self.func = func
...         self.depends_on = depends_on or []
...         functools.update_wrapper(self, func)
...     def __call__(self, *args, **kwargs):
...         return self.func(*args, **kwargs)
...
>>>
>>> class depends:
...     def __init__(self, on=None):
...         self.depends_on = on or []
...     def __call__(self, func):
...         return Dependee(func, self.depends_on)
...
>>> @depends(on=['foo','bar'])
... def sum(x,y): return x+y
...
>>> print(sum.depends_on)
['foo', 'bar']
>>> print(sum(1,2))
3
>>> _sum = dill.dumps(sum)
>>> sum_ = dill.loads(_sum)
>>> print(sum_(1,2))
3
>>> print(sum_.depends_on)
['foo', 'bar']
>>>
Get dill here: https://github.com/uqfoundation