In Python, how can you load YAML mappings as Order

2019-01-03 08:30发布

I'd like to get PyYAML's loader to load mappings (and ordered mappings) into the Python 2.7+ OrderedDict type, instead of the vanilla dict and the list of pairs it currently uses.

What's the best way to do that?

9条回答
该账号已被封号
2楼-- · 2019-01-03 08:43

The yaml module allow you to specify custom 'representers' to convert Python objects to text and 'constructors' to reverse the process.

_mapping_tag = yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG

def dict_representer(dumper, data):
    return dumper.represent_dict(data.iteritems())

def dict_constructor(loader, node):
    return collections.OrderedDict(loader.construct_pairs(node))

yaml.add_representer(collections.OrderedDict, dict_representer)
yaml.add_constructor(_mapping_tag, dict_constructor)
查看更多
做个烂人
3楼-- · 2019-01-03 08:43

There is a PyYAML ticket on the subject opened 5 years ago. It contains some relevant links, including the link to this very question :) I personally grabbed gist 317164 and modified it a little bit to use OrderedDict from Python 2.7, not the included implementation (just replaced the class with from collections import OrderedDict).

查看更多
在下西门庆
4楼-- · 2019-01-03 08:44

Update: the library was deprecated in favor of the yamlloader (which is based on the yamlordereddictloader)

I've just found a Python library (https://pypi.python.org/pypi/yamlordereddictloader/0.1.1) which was created based on answers to this question and is quite simple to use:

import yaml
import yamlordereddictloader

datas = yaml.load(open('myfile.yml'), Loader=yamlordereddictloader.Loader)
查看更多
霸刀☆藐视天下
5楼-- · 2019-01-03 08:47

here's a simple solution that also checks for duplicated top level keys in your map.

import yaml
import re
from collections import OrderedDict

def yaml_load_od(fname):
    "load a yaml file as an OrderedDict"
    # detects any duped keys (fail on this) and preserves order of top level keys
    with open(fname, 'r') as f:
        lines = open(fname, "r").read().splitlines()
        top_keys = []
        duped_keys = []
        for line in lines:
            m = re.search(r'^([A-Za-z0-9_]+) *:', line)
            if m:
                if m.group(1) in top_keys:
                    duped_keys.append(m.group(1))
                else:
                    top_keys.append(m.group(1))
        if duped_keys:
            raise Exception('ERROR: duplicate keys: {}'.format(duped_keys))
    # 2nd pass to set up the OrderedDict
    with open(fname, 'r') as f:
        d_tmp = yaml.load(f)
    return OrderedDict([(key, d_tmp[key]) for key in top_keys])
查看更多
Juvenile、少年°
6楼-- · 2019-01-03 08:48

Note: there is a library, based on the following answer, which implements also the CLoader and CDumpers: Phynix/yamlloader

I doubt very much that this is the best way to do it, but this is the way I came up with, and it does work. Also available as a gist.

import yaml
import yaml.constructor

try:
    # included in standard lib from Python 2.7
    from collections import OrderedDict
except ImportError:
    # try importing the backported drop-in replacement
    # it's available on PyPI
    from ordereddict import OrderedDict

class OrderedDictYAMLLoader(yaml.Loader):
    """
    A YAML loader that loads mappings into ordered dictionaries.
    """

    def __init__(self, *args, **kwargs):
        yaml.Loader.__init__(self, *args, **kwargs)

        self.add_constructor(u'tag:yaml.org,2002:map', type(self).construct_yaml_map)
        self.add_constructor(u'tag:yaml.org,2002:omap', type(self).construct_yaml_map)

    def construct_yaml_map(self, node):
        data = OrderedDict()
        yield data
        value = self.construct_mapping(node)
        data.update(value)

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            self.flatten_mapping(node)
        else:
            raise yaml.constructor.ConstructorError(None, None,
                'expected a mapping node, but found %s' % node.id, node.start_mark)

        mapping = OrderedDict()
        for key_node, value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            try:
                hash(key)
            except TypeError, exc:
                raise yaml.constructor.ConstructorError('while constructing a mapping',
                    node.start_mark, 'found unacceptable key (%s)' % exc, key_node.start_mark)
            value = self.construct_object(value_node, deep=deep)
            mapping[key] = value
        return mapping
查看更多
【Aperson】
7楼-- · 2019-01-03 08:50

On my For PyYaml installation for Python 2.7 I updated __init__.py, constructor.py, and loader.py. Now supports object_pairs_hook option for load commands. Diff of changes I made is below.

__init__.py

$ diff __init__.py Original
64c64
< def load(stream, Loader=Loader, **kwds):
---
> def load(stream, Loader=Loader):
69c69
<     loader = Loader(stream, **kwds)
---
>     loader = Loader(stream)
75c75
< def load_all(stream, Loader=Loader, **kwds):
---
> def load_all(stream, Loader=Loader):
80c80
<     loader = Loader(stream, **kwds)
---
>     loader = Loader(stream)

constructor.py

$ diff constructor.py Original
20,21c20
<     def __init__(self, object_pairs_hook=dict):
<         self.object_pairs_hook = object_pairs_hook
---
>     def __init__(self):
27,29d25
<     def create_object_hook(self):
<         return self.object_pairs_hook()
<
54,55c50,51
<         self.constructed_objects = self.create_object_hook()
<         self.recursive_objects = self.create_object_hook()
---
>         self.constructed_objects = {}
>         self.recursive_objects = {}
129c125
<         mapping = self.create_object_hook()
---
>         mapping = {}
400c396
<         data = self.create_object_hook()
---
>         data = {}
595c591
<             dictitems = self.create_object_hook()
---
>             dictitems = {}
602c598
<             dictitems = value.get('dictitems', self.create_object_hook())
---
>             dictitems = value.get('dictitems', {})

loader.py

$ diff loader.py Original
13c13
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
18c18
<         BaseConstructor.__init__(self, **constructKwds)
---
>         BaseConstructor.__init__(self)
23c23
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
28c28
<         SafeConstructor.__init__(self, **constructKwds)
---
>         SafeConstructor.__init__(self)
33c33
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
38c38
<         Constructor.__init__(self, **constructKwds)
---
>         Constructor.__init__(self)
查看更多
登录 后发表回答