How do I write a representer for PyYAML?

Published 2019-04-07 08:00

Question:

I want a custom function that serializes arbitrary Python objects, similar to how json.dump accepts an optional argument called 'default': a function that the JSON dumper calls when an object is not JSON serializable.

I simply want to do the equivalent of this from the json package.

json.dump(tests_dump, open('somefile', 'w+'), default=lambda x: x.__dict__)

From the PyYAML docs, it looks like I need to use yaml.add_representer, but it really isn't clear how to do this.

Answer 1:

Here is a sample for add_representer. Not sure if this is exactly what you want. Nevertheless ...

import yaml

#Arbitrary Class
class MyClass:
    def __init__(self, someNumber, someString):
        self.var1 = someNumber
        self.var2 = someString

#define the representer, responsible for serialization
def MyClass_representer(dumper, data):
    serializedData = str(data.var1) + "|" + data.var2
    return dumper.represent_scalar('!MyClass', serializedData )

#'register' it     
yaml.add_representer(MyClass, MyClass_representer)

obj = MyClass(100,'test')

print ( 'original Object\nvar1:{0}, var2:{1}\n'.format(obj.var1, obj.var2) )

#serialize
yamlData = yaml.dump(obj)

print('serialized as:\n{0}'.format(yamlData) )

#Now to deserialize you need a constructor
def MyClass_constructor(loader,node):
    value = loader.construct_scalar(node)
    someNumber,sep,someString = value.partition("|")
    #partition returns strings, so convert var1 back to a number
    return MyClass(int(someNumber),someString)

#'register' it    
yaml.add_constructor('!MyClass', MyClass_constructor)

#deserialize
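#newer PyYAML releases require an explicit Loader; yaml.Loader can build arbitrary objects, so use it only on trusted input
obj2 = yaml.load(yamlData, Loader=yaml.Loader)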

print ( 'after deserialization\nvar1:{0}, var2:{1}\n'.format(obj2.var1, obj2.var2) )

Of course, there is some code duplication and the code is not optimized. You can make these two functions part of your class, and also implement __repr__ to get a printable representation that you can use to populate serializedData in MyClass_representer.
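
If the goal is closer to the json default=lambda x: x.__dict__ behaviour from the question, one possible approach is a multi-representer registered for object, so that any instance without a more specific representer is dumped as a mapping of its attributes. The following is only a sketch under some assumptions: DictDumper and represent_as_dict are hypothetical names chosen for illustration, the dumper subclasses yaml.SafeDumper so the stock dumpers are left untouched, and every object reaching the fallback is assumed to have a __dict__.

```python
import yaml


class MyClass:
    def __init__(self, some_number, some_string):
        self.var1 = some_number
        self.var2 = some_string


class DictDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass; registrations here do not affect SafeDumper."""


def represent_as_dict(dumper, data):
    # Fallback for any type without an exact representer:
    # emit the instance attributes as a plain YAML mapping.
    # vars() raises TypeError for objects that have no __dict__.
    return dumper.represent_dict(vars(data))


# A multi-representer also matches subclasses, so registering it for
# object covers every class that has no more specific representer.
DictDumper.add_multi_representer(object, represent_as_dict)

obj = MyClass(100, 'test')
yaml_text = yaml.dump(obj, Dumper=DictDumper)
print(yaml_text)
```

Because the output is an untagged mapping, loading it back gives a plain dict rather than a MyClass instance, which mirrors what the json version does. Built-in types such as int, str, list and dict keep their usual representation, since exact-type representers are checked before multi-representers.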