Pickle all attributes except one

Posted 2019-02-21 09:34

What is the best way to write a __getstate__ method that pickles almost all of an object's attributes, but excludes a few?

I have an object with many attributes, including one that references an instance method. Instance methods (type instancemethod) are not pickleable, so I get an error when I try to pickle this object:

class Foo(object):
    def __init__(self):
        self.a = 'spam'
        self.b = 'eggs'
        self.c = 42
        self.fn = self.my_func
    def my_func(self):
        print 'My hovercraft is full of eels'

import pickle
pickle.dumps(Foo())              # throws a "can't pickle instancemethod objects" TypeError

This __getstate__ method fixes this, but then I have to manually include all the properties I want to serialize:

def __getstate__(self):
    return { 'a': self.a, 'b': self.b, 'c': self.c }

That's not very scalable or maintainable if I have an object with many attributes or that changes frequently.

The only alternative I can think of is some kind of helper function that iterates through an object's properties and adds them (or not) to the dictionary, based on the type.

Tags: python pickle
6 answers
We Are One
#2 · 2019-02-21 09:58

"The only alternative I can think of is some kind of helper function that iterates through an object's properties and adds them (or not) to the dictionary, based on the type."

Yeah, I think that's pretty much what you're left with, if you want enough "magic" to allow yourself to be lazy (and/or allow for dynamically added attributes). Keep in mind that "pickle can't handle this" isn't the only reason you might not want to include something in the pickled state.

But it's not as hard as you seem to think, assuming you have code for the "should I pickle this?" logic:

def __getstate__(self):
    # should_pickle() stands for whatever "should I pickle this?" test you write
    return dict((k, v) for (k, v) in self.__dict__.iteritems() if should_pickle(v))
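
One possible shape for that "should I pickle this?" logic is simply to try pickling each value and see whether it fails. This is only a sketch under a few assumptions: should_pickle is a made-up helper (the answer leaves it undefined), the code is written for Python 3 (hence items() rather than iteritems()), and a lambda stands in for the unpicklable attribute.

```python
import pickle


def should_pickle(value):
    """Hypothetical helper: True if pickle.dumps accepts `value`."""
    try:
        pickle.dumps(value)
    except Exception:  # the exact exception type varies by object and version
        return False
    return True


class Foo(object):
    def __init__(self):
        self.a = 'spam'
        self.b = 'eggs'
        self.c = 42
        self.fn = lambda: 'eels'  # stand-in for an attribute pickle rejects

    def __getstate__(self):
        # Keep only the attributes that pass the trial pickle.
        return {k: v for k, v in self.__dict__.items() if should_pickle(v)}


restored = pickle.loads(pickle.dumps(Foo()))
```

Note that this pickles every surviving value twice (once as a trial, once for real), so it may be too slow for large objects; a type-based check avoids that cost.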
走好不送
#3 · 2019-02-21 10:01

I'd cut to the root of your problem and try to serialize the so-called "un-pickleable" items first. To do this, I'd use dill, which can serialize almost anything in Python. dill also has some good tools to help you understand what is causing the failure when pickling fails.

>>> import dill
>>> dill.loads(dill.dumps(your_bad_object))
>>> ...
>>> # if you get a pickling error, use dill's tools to figure out a workaround
>>> dill.detect.badobjects(your_bad_object, depth=0)
>>> dill.detect.badobjects(your_bad_object, depth=1)
>>> ...

If you absolutely wanted to, you could use dill's badobjects (or one of the other detection functions) to dive recursively into your object's reference chain and pop out the unpickleable objects, instead of calling it at every depth, as above.

相关推荐>>
#4 · 2019-02-21 10:05

For your specific case (preventing a function from getting pickled), use this:

self.__class__.fn = self.__class__.my_func

Now, instead of adding a function to an instance of a class, you've added it to the class itself, thus the function won't get pickled. This won't work if you want each instance to have its own version of fn.
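
To see why this works, here is a small sketch (Python 3, reusing the Foo class from the question with my_func returning a string instead of printing). The class-level assignment never touches the instance __dict__, and the instance __dict__ is all that the default pickling stores.

```python
import pickle


class Foo(object):
    def __init__(self):
        self.a = 'spam'
        # Stored on the class, so it never enters the instance __dict__.
        self.__class__.fn = self.__class__.my_func

    def my_func(self):
        return 'My hovercraft is full of eels'


foo = Foo()
restored = pickle.loads(pickle.dumps(foo))

# The pickled state holds only 'a'; fn is found again through the class.
state_keys = sorted(restored.__dict__)
message = restored.fn()
```

The trade-off mentioned above is visible here: every Foo instance shares the same fn, because it lives on the class.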

My scenario was that I wanted to selectively add get_absolute_url to some Django models, and I wanted to define this in an abstract BaseModel class. I had self.get_absolute_url = … and ran into the pickle issue. Just adding __class__ to the assignment solved the issue in my case.

你好瞎i
#5 · 2019-02-21 10:10

Using is_instance_method from an earlier answer:

def __getstate__(self):
    return dict((k, v) for k, v in self.__dict__.iteritems()
                       if not is_instance_method(getattr(self, k)))

The is_instance_method check can also be performed less "magically" by taking a known instance method, say my_func, and using its type:

def __getstate__(self):
    instancemethod = type(self.my_func)
    return dict((k, v) for k, v in self.__dict__.iteritems()
                       if not isinstance(getattr(self, k), instancemethod))
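
The standard library already exposes that type as types.MethodType, so the same filter can be written without looking it up from my_func. A minimal sketch, written for Python 3 (items() instead of iteritems()) and reusing the Foo class from the question, with my_func returning a string instead of printing:

```python
import pickle
import types


class Foo(object):
    def __init__(self):
        self.a = 'spam'
        self.b = 'eggs'
        self.c = 42
        self.fn = self.my_func

    def my_func(self):
        return 'My hovercraft is full of eels'

    def __getstate__(self):
        # Drop every attribute whose value is a bound method.
        return {k: v for k, v in self.__dict__.items()
                if not isinstance(v, types.MethodType)}


foo = Foo()
restored = pickle.loads(pickle.dumps(foo))
```

The restored object has a, b and c but no fn attribute; my_func itself is still available because it is defined on the class.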
成全新的幸福
#6 · 2019-02-21 10:19

You could always just remove the bad items:

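def __getstate__(self):
    # Copy first: deleting from self.__dict__ itself would also strip
    # the attribute from the live object.
    state = self.__dict__.copy()
    del state[...]
    return state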
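
A concrete version might look like the sketch below. The names are assumptions for illustration: _transient is a made-up class attribute listing what to leave out, and a lambda stands in for the unpicklable value so the example runs on Python 3.

```python
import pickle


class Foo(object):
    # Hypothetical list of attribute names to leave out of the pickle.
    _transient = ('fn',)

    def __init__(self):
        self.a = 'spam'
        self.b = 'eggs'
        self.fn = lambda: 'eels'  # stand-in for an unpicklable attribute

    def __getstate__(self):
        # Work on a copy so the live object keeps all of its attributes.
        state = self.__dict__.copy()
        for name in self._transient:
            state.pop(name, None)
        return state


foo = Foo()
restored = pickle.loads(pickle.dumps(foo))
```

After the round trip, foo still has fn, while restored only carries a and b. New attributes get pickled automatically; only the exclusions need listing.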
手持菜刀,她持情操
#7 · 2019-02-21 10:20

__slots__ solution

If you are using slots, you can avoid repeating members to exclude with:

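import pickle

class C(object):
    _pickle_slots = ['i']                # slots that should be pickled
    __slots__ = _pickle_slots + ['j']    # 'j' is left out of the pickle
    def __init__(self, i, j):
        self.i = i
        self.j = j
    def __getstate__(self):
        # (dict state, slots state): no __dict__ here, so the first item is None
        return (None, {k: getattr(self, k) for k in C._pickle_slots})

o = pickle.loads(pickle.dumps(C(1, 2), -1))

# i is there
assert o.i == 1

# j was excluded
try:
    o.j
except AttributeError:
    pass
else:
    raise AssertionError('j should not have been pickled')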

Tested in Python 2.7.6.
