Sorting a nested OrderedDict by key, recursively

Posted 2020-06-01 07:27

Question:

Say orig is an OrderedDict that contains ordinary string:string key-value pairs, but sometimes a value is another, nested OrderedDict.

I want to sort orig by key, alphabetically (ascending), and do so recursively.

Rules:

  • Assume key strings are unpredictable
  • Assume nesting can be arbitrarily deep, e.g. levels 1-50 can all have strings, OrderedDicts, etc. as values.

I need help with the sorting step:

import string
from collections import OrderedDict
from random import choice


orig = OrderedDict((
    ('a', choice(string.digits)),
    ('b', choice(string.digits)),
    ('c', choice(string.digits)),
    ('special', OrderedDict((
        ('a', choice(string.digits)),
        ('b', choice(string.digits)),
        ('c', choice(string.digits)),
    )))
))

sorted_copy = OrderedDict(sorted(orig.items(), ...))

self.assertEqual(orig, sorted_copy)

Answer 1:

EDIT: for Python 3.6+, @pelson's answer is better.

Something like:

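from collections import OrderedDict


def sortOD(od):
    # Build a new OrderedDict with keys in ascending order at every level.
    res = OrderedDict()
    for k, v in sorted(od.items()):
        if isinstance(v, dict):
            # OrderedDict is a dict subclass, so nested ones recurse here.
            res[k] = sortOD(v)
        else:
            res[k] = v
    return res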
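
A quick check of the function above, as a sketch only: the sample data and the names orig and result are made up for illustration, with the keys deliberately inserted out of order.

```python
from collections import OrderedDict


def sortOD(od):
    # Same logic as the answer above, repeated so this snippet runs on its own.
    res = OrderedDict()
    for k, v in sorted(od.items()):
        res[k] = sortOD(v) if isinstance(v, dict) else v
    return res


# Hypothetical sample data with keys deliberately out of order.
orig = OrderedDict((
    ('special', OrderedDict((('c', '3'), ('a', '1'), ('b', '2')))),
    ('b', '5'),
    ('a', '4'),
))

result = sortOD(orig)
print(list(result))             # ['a', 'b', 'special']
print(list(result['special']))  # ['a', 'b', 'c']
print(list(orig))               # ['special', 'b', 'a'] (input is not modified)
```

Keys are unique within a dict, so sorting the (key, value) pairs never has to compare two values.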


Answer 2:

@acushner's solution can be simplified in Python 3.6+, where dictionaries preserve their insertion order.

Since a standard dictionary is now enough, the code becomes:

def order_dict(dictionary):
    result = {}
    for k, v in sorted(dictionary.items()):
        if isinstance(v, dict):
            result[k] = order_dict(v)
        else:
            result[k] = v
    return result

Because we can use standard dictionaries, we can also use a standard dictionary comprehension, so the code boils down to:

def order_dict(dictionary):
    return {k: order_dict(v) if isinstance(v, dict) else v
            for k, v in sorted(dictionary.items())}
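
A small usage sketch for the comprehension version, assuming Python 3.7+ so that dict order is guaranteed. The sample data and the names config and ordered are hypothetical. It also shows that plain dict equality ignores order, so a test should compare key sequences rather than the dicts themselves.

```python
import json


def order_dict(dictionary):
    # Same comprehension as above, repeated so this snippet runs on its own.
    return {k: order_dict(v) if isinstance(v, dict) else v
            for k, v in sorted(dictionary.items())}


# Hypothetical sample data.
config = {'zeta': 1, 'alpha': {'y': 2, 'x': 3}, 'mid': 'value'}
ordered = order_dict(config)

print(json.dumps(ordered))
# {"alpha": {"x": 3, "y": 2}, "mid": "value", "zeta": 1}

# Plain dicts compare equal regardless of order; key lists do not.
print(ordered == config)              # True
print(list(ordered) == list(config))  # False
```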

See also https://mail.python.org/pipermail/python-dev/2016-September/146327.html for details on Python's ordered dictionary implementation, and https://mail.python.org/pipermail/python-dev/2017-December/151283.html for the pronouncement that this is a language feature as of Python 3.7.



Answer 3:

Very similar to @acushner's solution, but class-based:

from collections import OrderedDict


class SortedDict(OrderedDict):

    def __init__(self, **kwargs):
        super(SortedDict, self).__init__()

        for key, value in sorted(kwargs.items()):
            if isinstance(value, dict):
                self[key] = SortedDict(**value)
            else:
                self[key] = value

Usage:

sorted_dict = SortedDict(**unsorted_dict)
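
A short sketch of the class in use, with made-up sample data. One caveat worth seeing in practice: because the constructor takes **kwargs, every key at every level must be a string, which matches the question but not dictionaries in general.

```python
from collections import OrderedDict


class SortedDict(OrderedDict):
    # Same class as above, repeated so this snippet runs on its own.
    def __init__(self, **kwargs):
        super(SortedDict, self).__init__()
        for key, value in sorted(kwargs.items()):
            if isinstance(value, dict):
                self[key] = SortedDict(**value)
            else:
                self[key] = value


# Hypothetical sample data.
unsorted_dict = {'b': '2', 'special': {'z': '9', 'y': '8'}, 'a': '1'}
sorted_dict = SortedDict(**unsorted_dict)

print(list(sorted_dict))             # ['a', 'b', 'special']
print(list(sorted_dict['special']))  # ['y', 'z']

# Keyword unpacking only accepts string keys, so this raises TypeError.
try:
    SortedDict(**{1: 'one'})
except TypeError as exc:
    print('non-string key rejected:', exc)
```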


Answer 4:

I faced a very similar issue: I needed a stable object so that I could compute a stable hash, but my objects contained a mix of lists and dictionaries. I had to sort all the dictionaries, depth first, and then sort the lists. This extends @acushner's answer:

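import json
from collections import OrderedDict


def deep_sort(obj):
    if isinstance(obj, dict):
        # Sort this level by key, then recurse into nested containers.
        obj = OrderedDict(sorted(obj.items()))
        for k, v in obj.items():
            if isinstance(v, (dict, list)):
                obj[k] = deep_sort(v)

    if isinstance(obj, list):
        # Sort nested containers first; note this updates the list in place.
        for i, v in enumerate(obj):
            if isinstance(v, (dict, list)):
                obj[i] = deep_sort(v)
        # Order the items by their JSON text so mixed types can be compared.
        obj = sorted(obj, key=lambda x: json.dumps(x))

    return obj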
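
To illustrate the stable-hash use case, here is a minimal sketch. The helper stable_hash, the choice of SHA-256, and the sample data are assumptions for illustration, not part of the original answer. It only makes sense when list order carries no meaning in your data, since deep_sort reorders lists.

```python
import hashlib
import json
from collections import OrderedDict


def deep_sort(obj):
    # Same logic as above, repeated so this snippet runs on its own.
    if isinstance(obj, dict):
        obj = OrderedDict(sorted(obj.items()))
        for k, v in obj.items():
            if isinstance(v, (dict, list)):
                obj[k] = deep_sort(v)
    if isinstance(obj, list):
        for i, v in enumerate(obj):
            if isinstance(v, (dict, list)):
                obj[i] = deep_sort(v)
        obj = sorted(obj, key=lambda x: json.dumps(x))
    return obj


def stable_hash(obj):
    # Hypothetical helper: hash the JSON text of the deep-sorted object.
    payload = json.dumps(deep_sort(obj)).encode('utf-8')
    return hashlib.sha256(payload).hexdigest()


# Two hypothetical objects with the same content in a different order.
first = {'b': [3, 1, 2], 'a': {'y': [{'k': 2}, {'k': 1}], 'x': 0}}
second = {'a': {'x': 0, 'y': [{'k': 1}, {'k': 2}]}, 'b': [2, 3, 1]}

print(stable_hash(first) == stable_hash(second))  # True
```

List items are ordered by their JSON text rather than by value, so numbers sort as strings ('10' before '2'). That is still deterministic, which is all a stable hash needs.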

As a side point, if the objects you need to sort contain class instances, you can jsonpickle.dumps() them, then json.loads() the result, then deep_sort() it. If it matters, you can always json.dumps() and jsonpickle.loads() afterwards to get back to where you started, except sorted (well, only sorted in Python 3.6+). For a stable hash, that round trip isn't necessary.