How do you control the order in which PyYAML outputs key/value pairs when serializing a Python dictionary?
I'm using YAML as a simple serialization format in a Python script. My YAML-serialized objects represent a sort of "document", so for maximum user-friendliness, I'd like my object's "name" field to appear first in the file. Of course, since the value returned by my object's __getstate__ is a dictionary, and Python dictionaries are unordered, the "name" field will be serialized to an arbitrary location in the output.
e.g.
>>> import yaml
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         return self.__dict__.copy()
...
>>> doc = Document('obj-20111227')
>>> print(yaml.dump(doc, indent=4))
!!python/object:__main__.Document
otherstuff: blah
name: obj-20111227
It took me a few hours of digging through the PyYAML docs and tickets, but I eventually discovered this comment, which lays out some proof-of-concept code for serializing an OrderedDict as a normal YAML map while maintaining the order.
Applied to my original code, the solution looks something like:
>>> import yaml
>>> from collections import OrderedDict
>>> def dump_anydict_as_map(anydict):
...     yaml.add_representer(anydict, _represent_dictorder)
...
>>> def _represent_dictorder(self, data):
...     if isinstance(data, Document):
...         return self.represent_mapping('tag:yaml.org,2002:map', data.__getstate__().items())
...     else:
...         return self.represent_mapping('tag:yaml.org,2002:map', data.items())
...
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         d = OrderedDict()
...         d['name'] = self.name
...         d['otherstuff'] = self.otherstuff
...         return d
...
>>> dump_anydict_as_map(Document)
>>> doc = Document('obj-20111227')
>>> print(yaml.dump(doc, indent=4))
!!python/object:__main__.Document
name: obj-20111227
otherstuff: blah
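For anyone who only needs the ordering and not the Document class, the same representer idea can be sketched directly on an OrderedDict. This is a minimal sketch, assuming Python 3 and a standard PyYAML install; the function name represent_ordereddict is my own choice, not something PyYAML provides.

```python
from collections import OrderedDict

import yaml


def represent_ordereddict(dumper, data):
    # Passing the items view (pairs) rather than a dict keeps the
    # insertion order instead of letting PyYAML sort the keys.
    return dumper.represent_mapping('tag:yaml.org,2002:map', data.items())


# Register the representer on the default Dumper used by yaml.dump().
yaml.add_representer(OrderedDict, represent_ordereddict)

doc = OrderedDict([('name', 'obj-20111227'), ('otherstuff', 'blah')])
text = yaml.dump(doc, default_flow_style=False)
print(text)
```

Because the representer is registered for OrderedDict only, plain dicts dumped elsewhere in the same program keep PyYAML's default behaviour.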
Cerin, thanks a lot for your answer; it helped me resolve my issue. It took me some time to understand, though, because no input dictionary was shown, so I'm re-posting @cerin's answer with an input dictionary. Here the output is displayed as separate entries, so this approach works well for recursively dumping data to a YAML file in a predefined order.
import yaml
from collections import OrderedDict

input_dict = {"first_key": "first_value", "second_key": "second_value", "third_key": "third_value"}

def dump_anydict_as_map(anydict):
    # Register the order-preserving representer for the given class
    yaml.add_representer(anydict, _represent_dictorder)

def _represent_dictorder(self, data):
    if isinstance(data, Document):
        return self.represent_mapping('tag:yaml.org,2002:map', data.__getstate__().items())
    else:
        return self.represent_mapping('tag:yaml.org,2002:map', data.items())

class Document(object):
    def __init__(self, name):  # no need to preserve the order here
        self.first_key = input_dict["first_key"]
        self.second_key = input_dict["second_key"]
        self.third_key = input_dict["third_key"]
    def __getstate__(self):  # this is where order should be defined
        d = OrderedDict()
        d['second_key'] = self.second_key
        d['third_key'] = self.third_key
        d['first_key'] = self.first_key
        return d

dump_anydict_as_map(Document)
doc = Document('obj-20111227')
print(yaml.dump([doc], default_flow_style=False))
Output:
- second_key: second_value
  third_key: third_value
  first_key: first_value
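As a side note, on a newer setup the same ordering may be possible without a custom representer. The sketch below assumes PyYAML 5.1 or later (where yaml.dump accepts sort_keys) and Python 3.7 or later (where plain dicts keep insertion order); neither assumption comes from the answers above, and the values are placeholders.

```python
import yaml

# Keys are written in the order they should appear in the output.
data = {
    "second_key": "second_value",
    "third_key": "third_value",
    "first_key": "first_value",
}

# sort_keys=False tells PyYAML to keep the dict's own order
# instead of sorting the keys alphabetically.
text = yaml.dump([data], default_flow_style=False, sort_keys=False)
print(text)
```

This only helps when the order you want is the order in which the dict was built; for anything else the representer approach is still needed.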
The last time I checked, Python's dictionaries weren't ordered. If you really want them to be, I strongly recommend using a list of key/value pairs.
[
    ('key', 'value'),
    ('key2', 'value2')
]
Alternatively, define a list with the keys and put them in the right order.
keys = ['key1', 'name', 'price', 'key2']
for key in keys:
    print(obj[key])
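To tie this back to YAML output, one way to use such a key list is to turn the dictionary into a list of single-key mappings before dumping, since YAML sequences always keep their order. This is only a rough sketch: the obj contents are made-up placeholder values, and the resulting file is a sequence of one-entry maps rather than a single map, which may or may not suit your format.

```python
import yaml

# Hypothetical data; only the key names come from the example above.
obj = {"price": 9.99, "name": "widget", "key1": "a", "key2": "b"}
keys = ["key1", "name", "price", "key2"]

# Build (key, value) pairs in the desired order.
ordered_pairs = [(key, obj[key]) for key in keys]

# Dump each pair as its own one-entry mapping inside a sequence.
text = yaml.dump([{k: v} for k, v in ordered_pairs], default_flow_style=False)
print(text)
```

Reading the file back gives a list of small dicts, so the consumer has to merge them again if it wants a single mapping.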