Pickle a dynamically parameterized sub-class

2020-02-09 02:49发布

问题:

I have a system which commonly stores pickled class types.

I want to be able to save dynamically-parameterized classes in the same way, but I can't because I get a PicklingError on trying to pickle a class which is not globally found (not defined in simple code).

My problem can be modeled as the following example code:

class Base(object):
 def m(self):
  return self.__class__.PARAM

def make_parameterized(param_value):
 class AutoSubClass(Base):
  PARAM = param_value
 return AutoSubClass

cls = make_parameterized(input("param value?"))

When I try to pickle the class, I get the following error:

# pickle.PicklingError: Can't pickle <class '__main__.AutoSubClass'>: it's not found as __main__.AutoSubClass
import pickle
print pickle.dumps(cls)

I am looking for some method to declare Base as a ParameterizableBaseClass which should define the params needed (PARAM in above example). A dynamic parameterized subclass (cls above) should then be picklable by saving the "ParameterizableBaseClass" type and the different param-values (dynamic param_value above).

I am sure that in many cases, this can be avoided altogether... And I can avoid this in my code as well if I really (really) have to. I was playing with __metaclass__, copyreg and even __builtin__.issubclass at some point (don't ask), but was unable to crack this one.

I feel like I wouldn't be true to the python spirit if I wasn't to ask: how can this be achieved, in a relatively clean way?

回答1:

Yes, it is possible -

Whenever you want to custom the Pickle and Unpickle behaviors for your objects, you just have to set the "__getstate__" and "__setstate__" methods on the class itself.

In this case it is a bit trickier: There need, as you observed - to exist a class on the global namespace that is the class of the currently being pickled object: it has to be the same class, with the same name. Ok - the deal is that gthis class existing in the globalname space can be created at Pickle time.

At Unpickle time the class, with the same name, have to exist - but it does not have to be the same object - just behave like it does - and as __setstate__ is called in the Unpickling proccess, it can recreate the parameterized class of the orignal object, and set its own class to be that one, by setting the __class__ attribute of the object.

Setting the __class__ attribute of an object may seen objectionable but it is how OO works in Python and it is officially documented, it even works accross implementations. (I tested this snippet in both Python 2.6 and Pypy)

class Base(object):
    def m(self):
        return self.__class__.PARAM
    def __getstate__(self):
        global AutoSub
        AutoSub = self.__class__
        return (self.__dict__,self.__class__.PARAM)
    def __setstate__(self, state):
        self.__class__ = make_parameterized(state[1])
        self.__dict__.update(state[0])

def make_parameterized(param_value):
    class AutoSub(Base):
        PARAM = param_value
    return AutoSub

class AutoSub(Base):
    pass


if __name__ == "__main__":

    from pickle import dumps, loads

    a = make_parameterized("a")()
    b = make_parameterized("b")()

    print a.PARAM, b.PARAM, type(a) is type(b)
    a_p = dumps(a)
    b_p = dumps(b)

    del a, b
    a = loads(a_p)
    b = loads(b_p)

    print a.PARAM, b.PARAM, type(a) is type(b)


回答2:

I know this is a very old question, but I think it is worth sharing a better means of pickling the parameterised classes than the one that is the currently accepted solution (making the parameterised class a global).

Using the __reduce__ method, we can provide a callable which will return an uninitialised instance of our desired class.

class Base(object):
    def m(self):
        return self.__class__.PARAM

    def __reduce__(self):
        return (_InitializeParameterized(), (self.PARAM, ), self.__dict__)


def make_parameterized(param_value):
    class AutoSub(Base):
        PARAM = param_value
    return AutoSub


class _InitializeParameterized(object):
    """
    When called with the param value as the only argument, returns an 
    un-initialized instance of the parameterized class. Subsequent __setstate__
    will be called by pickle.
    """
    def __call__(self, param_value):
        # make a simple object which has no complex __init__ (this one will do)
        obj = _InitializeParameterized()
        obj.__class__ = make_parameterized(param_value)
        return obj

if __name__ == "__main__":

    from pickle import dumps, loads

    a = make_parameterized("a")()
    b = make_parameterized("b")()

    print a.PARAM, b.PARAM, type(a) is type(b)
    a_p = dumps(a)
    b_p = dumps(b)

    del a, b
    a = loads(a_p)
    b = loads(b_p)

    print a.PARAM, b.PARAM, type(a) is type(b)

It is worth reading the __reduce__ docs a couple of times to see exactly what is going on here.

Hope somebody finds this useful.



回答3:

I guess it's too late now, but pickle is a module I'd rather avoid for anything complex, because it has problems like this one and many more.

Anyways, since pickle wants the class in a global it can have it:

import cPickle

class Base(object):
    def m(self):
        return self.__class__.PARAM

    @classmethod
    def make_parameterized(cls,param):
        clsname = "AutoSubClass.%s" % param
        # create a class, assign it as a global under the same name
        typ = globals()[clsname] = type(clsname, (cls,), dict(PARAM=param))
        return typ

cls = Base.make_parameterized('asd')

import pickle
s = pickle.dumps(cls)

cls = pickle.loads(s)
print cls, cls.PARAM
# <class '__main__.AutoSubClass.asd'> asd

But yeah, you're probably overcomplicating things.



回答4:

Classes that are not created in the top level of a module cannot be pickled, as shown in the Python documentation.

Furthermore, even for an instance of a top level module class the class attributes are not stored. So in your example PARAM wouldn't be stored anyway. (Explained in the Python documentation section linked above as well)