Recently, I was asked to make "our C++ lib work in the cloud".
The lib is compute-intensive (it calculates prices), so running it in the cloud makes sense.
I have built a SWIG interface to produce a Python version, with the intention of using MapReduce via MRJob.
I wanted to serialize the objects to a file and then, in a mapper, deserialize each object and calculate its price.
For example:
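import dill
from mrjob.job import MRJob

class MRTest(MRJob):
    def mapper(self, key, value):
        # value holds one serialized object; rebuild it, then price it
        obj = dill.loads(value)
        yield (key, obj.price())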
But I have reached a dead end, since it seems that dill cannot handle SWIG extension types:
PicklingError: Can't pickle <class 'SwigPyObject'>: it's not found as builtins.SwigPyObject
Is there a way to make this work properly?
I'm the dill author. That's correct: dill can't pickle C++ objects. When you see "it's not found as builtin.some_object", that almost invariably means that you are trying to pickle an object that is not written in Python but uses Python to bind to C/C++ (i.e. an extension type). You have no hope of directly pickling such objects with a Python serializer.
However, since you are interested in pickling a subclass of an extension type, you can actually do it. All you need to do is give your object the state you want to save as one or more instance attributes, and provide a __reduce__ method to tell dill (or pickle) how to save the state of your object. This method is how Python deals with serializing extension types. See:
https://docs.python.org/2/library/pickle.html#pickling-and-unpickling-extension-types
There are probably better examples, but here's at least one example:
https://stackoverflow.com/a/19874769/4646678
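As a rough sketch of the __reduce__ approach described above: the classes below are hypothetical stand-ins, not real SWIG output. NativeHandle imitates an unpicklable SwigPyObject, Instrument imitates a SWIG-generated proxy class, and the price formula is a toy. The sketch assumes the C++ object can be rebuilt from its constructor arguments alone. It uses the standard pickle module so that it runs without extra dependencies; per the answer above, dill consults __reduce__ in the same way.

```python
import pickle


class NativeHandle(object):
    """Hypothetical stand-in for a SwigPyObject: it refuses to be pickled."""

    def __init__(self, strike, rate):
        self.strike = strike
        self.rate = rate

    def __reduce_ex__(self, protocol):
        raise pickle.PicklingError("Can't pickle NativeHandle")


class Instrument(object):
    """Hypothetical stand-in for a SWIG-generated proxy class."""

    def __init__(self, strike, rate):
        # SWIG proxies keep the wrapped C++ object in an attribute like this.
        self.this = NativeHandle(strike, rate)

    def price(self):
        # Toy formula, only here so there is something to compute.
        return self.this.strike * (1.0 + self.this.rate)


class PicklableInstrument(Instrument):
    """Subclass that remembers how it was built so it can be rebuilt."""

    def __init__(self, strike, rate):
        Instrument.__init__(self, strike, rate)
        # The state to save, kept as a plain Python instance attribute.
        self._ctor_args = (strike, rate)

    def __reduce__(self):
        # Tell the pickler: "to restore me, call my class with these args".
        # No instance state is returned, so the native handle is never pickled.
        return (self.__class__, self._ctor_args)


original = PicklableInstrument(100.0, 0.5)
restored = pickle.loads(pickle.dumps(original))
print(restored.price())
```

Unpickling calls the constructor again, so the native object is recreated on the worker rather than copied. Pickling a plain Instrument still fails, because its instance dictionary contains the unpicklable handle.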