pickling class method

Posted 2019-01-28 07:50

I have a class whose instances need to format output as instructed by the user. There's a default format, which can be overridden. I implemented it like this:

class A:
  def __init__(self, params):
    # ...
    # by default printing all float values as percentages with 2 decimals
    self.format_functions = {float: lambda x : '{:.2%}'.format(x)}
  def __str__(self):
    # uses self.format_functions to format output
    # ...

a = A(params)
print(a) # uses default output formatting

# overriding default output formatting
# float printed as percentages 3 decimal digits; bool printed as Y / N
a.format_functions = {float : lambda x: '{:.3%}'.format(x),
                      bool : lambda x: 'Y' if x else 'N'}
print(a)

Is it ok? Let me know if there is a better way to design this.

Unfortunately, I need to pickle instances of this class. But only functions defined at the top level of the module can be pickled; lambda functions are unpicklable, so my format_functions instance attribute breaks the pickling.

I tried rewriting this to use a class method instead of lambda functions, but still no luck for the same reason:

class A:
  @classmethod
  def default_float_format(cls, x):
    return '{:.2%}'.format(x)
  def __init__(self, params):
    # ...
    # by default printing all float values as percentages with 2 decimals
    self.format_functions = {float: self.default_float_format}
  def __str__(self):
    # uses self.format_functions to format output
    # ...

a = A(params)
pickle.dumps(a) # Can't pickle <class 'method'>: attribute lookup builtins.method failed

Note that pickling here doesn't work even if I don't override the defaults; just the fact that I assigned self.format_functions = {float : self.default_float_format} breaks it.

What to do? I'd rather not pollute the namespace and break encapsulation by defining default_float_format at the module level.

Incidentally, why in the world does pickle create this restriction? It certainly feels like a gratuitous and substantial pain to the end user.

2 answers
啃猪蹄的小仙女
#2 · 2019-01-28 08:24

To pickle class instances or functions (and therefore methods), Python's pickle requires that their names be available as global variables. The reference to the method in the dictionary points to a name that is not available in the global namespace, or more precisely, the module namespace.

You could circumvent that by customizing how your class is pickled, by defining the "__getstate__" and "__setstate__" methods. But since the formatting function does not depend on any information from the object or the class itself (and even if some formatting function did, you could pass that in as parameters), I think you would be better off defining the function outside the class scope.
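For reference, here is a rough sketch of what the "__getstate__" / "__setstate__" route could look like. It is only one possible arrangement: the value attribute and the _default_formats helper are made up for illustration, and it assumes it is acceptable to drop the formatters when pickling and fall back to the defaults on load, so any user overrides are lost.

```python
import pickle


class A:
    def __init__(self, value):
        self.value = value  # hypothetical attribute, stands in for the real params
        self.format_functions = self._default_formats()

    @staticmethod
    def _default_formats():
        # Hypothetical helper: rebuilds the default (unpicklable) lambdas on demand.
        return {float: lambda x: '{:.2%}'.format(x)}

    def __getstate__(self):
        # Copy the instance dict and leave out the unpicklable formatters.
        state = self.__dict__.copy()
        del state['format_functions']
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then reinstall the default formatters.
        self.__dict__.update(state)
        self.format_functions = self._default_formats()

    def __str__(self):
        formatter = self.format_functions.get(type(self.value), str)
        return formatter(self.value)


a = A(0.1234)
restored = pickle.loads(pickle.dumps(a))
print(restored)
```

The trade-off is that the pickle never contains the formatters at all, which is why this only suits cases where the defaults are good enough after loading.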

This does work (Python 3.2):

import pickle

def default_float_format(x):
    return '{:.2%}'.format(x)

class A:

  def __init__(self, params):
    # ...
    # by default printing all float values as percentages with 2 decimals
    self.format_functions = {float: default_float_format}
  def __str__(self):
    # uses self.format_functions to format output
    pass

a = A(1)
pickle.dumps(a)
趁早两清
#3 · 2019-01-28 08:35

If you use the dill module, either of your two approaches will just "work" as is. dill can pickle lambda as well as instances of classes and also class methods.

No need to pollute the namespace and break encapsulation, as you said you didn't want to do… but the other answer does.

dill is basically ten years or so worth of finding the right copy_reg function that registers how to serialize the majority of objects in standard python. Nothing special or tricky, it just takes time. So why doesn't pickle do this for us? Why does pickle have this restriction?

Well, if you look at the pickle docs, the answer is there: https://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled

Basically: Functions and classes are pickled by reference.
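A small illustration of what "by reference" means, using only the standard library. The function name top_level is made up for the example. The pickle of a function stores just the module and name needed to look it up again, not its code, which is why a lambda (whose name cannot be looked up) is rejected.

```python
import pickle


def top_level(x):
    return '{:.2%}'.format(x)


# Pickling a module-level function stores only where to find it.
payload = pickle.dumps(top_level)
name_in_payload = b'top_level' in payload
print(name_in_payload)

# A lambda's name is '<lambda>', which cannot be looked up in the module,
# so the standard pickle refuses it.
lambda_failed = False
try:
    pickle.dumps(lambda x: '{:.2%}'.format(x))
except (pickle.PicklingError, AttributeError) as exc:
    lambda_failed = True
    print('lambda could not be pickled:', type(exc).__name__)
```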

This means pickle does not work on objects defined in __main__, and it also doesn't work on many dynamically modified objects. dill registers __main__ as a module, so it has a valid namespace. dill also gives you the option to not pickle by reference, so you can serialize dynamically modified objects… and class instances, class methods (bound and unbound), and so on.
