Declare data type to ruamel.yaml so that it can re

Posted 2019-08-09 11:01

I am using a function from a Python library which returns an object with a specific data type. I would like to serialize that object to a YAML file using ruamel.yaml. The problem is that ruamel.yaml does not know how to serialize the specific data type that the function returns, and throws an exception:

RepresenterError: cannot represent an object: <...>

The question is how to "declare" the data type to ruamel.yaml so that it knows how to handle it.

Note: I can't / I don't want to make changes to the library or anything of that sort. I am only the consumer of an API.

To make this more concrete, let's use the following example that uses socket.AF_INET which happens to be an IntEnum but the specific data type should not be important.

import sys
import socket

import ruamel.yaml

def third_party_lib():
    """ Return a dict with our data """
    return {"AF_INET": socket.AF_INET}

yaml = ruamel.yaml.YAML(typ="safe", pure=True)
yaml.dump(third_party_lib(), sys.stdout)

which gives this error:

    ruamel.yaml.YAML.dump(self, data, stream, **kw)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/main.py", line 439, in dump
    return self.dump_all([data], stream, _kw, transform=transform)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/main.py", line 453, in dump_all
    self._context_manager.dump(data)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/main.py", line 801, in dump
    self._yaml.representer.represent(data)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/representer.py", line 84, in represent
    node = self.represent_data(data)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/representer.py", line 111, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/representer.py", line 359, in represent_dict
    return self.represent_mapping(u'tag:yaml.org,2002:map', data)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/representer.py", line 222, in represent_mapping
    node_value = self.represent_data(item_value)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/representer.py", line 121, in represent_data
    node = self.yaml_representers[None](self, data)
  File "/home/feanor/Prog/git/vps-bench/.direnv/python-venv-3.7.2/lib/python3.7/site-packages/ruamel/yaml/representer.py", line 392, in represent_undefined
    raise RepresenterError('cannot represent an object: %s' % data)
ruamel.yaml.representer.RepresenterError: cannot represent an object: AddressFamily.AF_INET

1 answer
Juvenile、少年°
#2 · 2019-08-09 11:37

In order for ruamel.yaml to be able to dump a specific class, whether you define it yourself, get it from the standard library, or get it from somewhere else, you need to register that class against the representer. (This is not necessary when using YAML(typ='unsafe'), but I assume you don't want to resort to that.)

This registration can be done in different ways. Assuming you have done yaml = ruamel.yaml.YAML() or yaml = ruamel.yaml.YAML(typ='safe'), and want to represent SomeClass, you can:

  • use yaml.register_class(SomeClass). This might work on other classes depending on how they are defined.
  • use one of the decorators @yaml_object(yaml) or @yaml.register_class, just before the class SomeClass: definition. This is primarily of use when defining your own classes.
  • add a representer directly using: yaml.representer.add_representer(SomeClass, some_class_to_yaml)

The first two ways are just syntactic sugar wrapped around the third way, and they will try to use a method to_yaml and a class attribute yaml_tag if available, and try to do something sensible if either is not available.

You can try yaml.register_class(socket.AF_INET), but you'll notice that it fails because:

AttributeError: 'AddressFamily' object has no attribute '__name__'

So you'll have to resort to the third way using add_representer(). The argument some_class_to_yaml is a function that will be called when a SomeClass instance is encountered, and that function is called with the yaml.representer instance as first argument and with the actual data (the instance of SomeClass) as second argument.

If SomeClass is some container type that could recursively reference itself (indirectly), you need to take special care dealing with that possibility, but for socket.AF_INET this is not necessary.

The specific data type matters insofar as you need to decide how to represent it in YAML. Quite often you'll see that the attributes of SomeClass are used as keys in a mapping (and then it is the mapping that gets the tag). Sometimes the type can be represented directly as a non-collection type available in YAML, such as a string or an int, and for other classes a (tagged) sequence makes more sense.

When you print type(socket.AF_INET), you'll notice that "SomeClass" is actually AddressFamily. And after inspecting socket.AF_INET using dir(), you'll notice that there is a name attribute and that nicely gives you a string 'AF_INET', which can be used to tell the representer how to represent this data as a string, without resorting to some lookup:

import sys
import socket
import ruamel.yaml


def repr_socket(representer, data):
    return representer.represent_scalar(u'!socket', data.name)

yaml = ruamel.yaml.YAML()
yaml.representer.add_representer(socket.AddressFamily, repr_socket)

data = dict(sock=socket.AF_INET)
yaml.dump(data, sys.stdout)

which gives:

sock: !socket AF_INET

Make sure the tag is defined as unicode (necessary in case you are using Python 2.7).

If you also want to load this, you can extend the constructor in a similar way. But this time you'll get a Node that you need to convert to an AddressFamily instance.

yaml_str = """\
- !socket AF_INET
- !socket AF_UNIX
"""

def constr_socket(constructor, node):
    return getattr(socket, node.value)

yaml.constructor.add_constructor(u'!socket', constr_socket)
data = yaml.load(yaml_str)

assert data[0] == socket.AF_INET
assert data[1] == socket.AF_UNIX

which runs without throwing an exception, and shows that the other constants in socket are handled as well.
