I have an ordered dictionary (OrderedDict) sorted by value. How can I get the top N (say 25) key-value pairs and add them to a new dictionary?
For example, I have something like this:
dictionary={'a':10,'b':20,'c':30,'d':5}
ordered=OrderedDict(sorted(dictionary.items(), key=lambda x: x[1],reverse=True))
Now ordered is an ordered dictionary. I want to create a new dictionary from, say, the top 2 most frequent items and their keys:
frequent={'c':30,'b':20}
The primary purpose of collections.OrderedDict is retaining the order in which the elements were inserted.
What you want here is collections.Counter, which has the n-most-frequent functionality built-in:
>>> dictionary={'a':10,'b':20,'c':30,'d':5}
>>> import collections
>>> collections.Counter(dictionary).most_common(2)
[('c', 30), ('b', 20)]
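Since most_common() returns a list of (key, count) tuples rather than a dictionary, one more step is needed to get the result asked for in the question. A minimal sketch, assuming Python 3.7+ (where a plain dict keeps insertion order); the name top_pairs is only illustrative:

```python
from collections import Counter

dictionary = {'a': 10, 'b': 20, 'c': 30, 'd': 5}

# most_common(n) returns a list of (key, count) pairs, highest count first
top_pairs = Counter(dictionary).most_common(2)

# dict() accepts an iterable of pairs, giving the plain dictionary asked for
frequent = dict(top_pairs)
print(frequent)  # {'c': 30, 'b': 20}
```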
Just make a new dictionary from the first N items (key-value pairs) of the reverse-sorted ordered dictionary you already have. For example, to get the top three items you could do something like this:
from collections import OrderedDict
from operator import itemgetter
# create dictionary you have
dictionary = {'a': 10, 'b': 20, 'c': 30, 'd': 5}
ordered = OrderedDict(sorted(dictionary.items(), key=itemgetter(1), reverse=True))
topthree = dict(ordered.items()[:3])  # Python 2 only: items() returns a sliceable list
print(topthree) # -> {'a': 10, 'c': 30, 'b': 20} (a plain Python 2 dict does not keep order)
For Python 3 one could use dict(list(ordered.items())[:3]) since items() returns a view object in that version, which does not support slicing. Alternatively you could use dict(itertools.islice(ordered.items(), 3)), which would work in both Python 2 and 3.
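Here is the islice variant written out in full, as a sketch that reuses the same sample data. On Python 3.7+ the resulting plain dict also keeps the sorted order:

```python
from collections import OrderedDict
from itertools import islice
from operator import itemgetter

dictionary = {'a': 10, 'b': 20, 'c': 30, 'd': 5}
ordered = OrderedDict(sorted(dictionary.items(), key=itemgetter(1), reverse=True))

# islice stops after three pairs, so the items are never copied into a full list
topthree = dict(islice(ordered.items(), 3))
print(topthree)  # {'c': 30, 'b': 20, 'a': 10}
```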
Also note the result is just a regular dictionary, as you specified in your question, not a collections.Counter or other type of mapping. This approach is very general and doesn't require the original dictionary to have integer values, just values that can be ordered (i.e. compared via the key function).
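To illustrate the point about non-integer values, here is a small sketch with made-up data: the values are strings and the key function ranks them by length. The names dictionary and the variable longest_two are hypothetical and not from the question:

```python
from collections import OrderedDict

# hypothetical data: values are strings, ranked by their length
names = {'x': 'kiwi', 'y': 'banana', 'z': 'fig', 'w': 'cherry pie'}

# the key function decides the ordering; here, longest string first
ordered = OrderedDict(sorted(names.items(), key=lambda kv: len(kv[1]), reverse=True))

# list() is needed in Python 3 because items() cannot be sliced directly
longest_two = dict(list(ordered.items())[:2])
print(longest_two)  # {'w': 'cherry pie', 'y': 'banana'}
```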
Have you tried slicing the list of tuples returned by sorted() to get the top n most frequent items and their keys?
For example, if you need the top 2 most frequent items, you might do
dictionary={'a':10,'b':20,'c':30,'d':5}
ordered=dict(sorted(dictionary.items(), key=lambda x: x[1],reverse=True)[:2])
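If you need this in more than one place, the same sort-and-slice idea can be wrapped in a small helper. This is only a sketch; top_n is a made-up name, not a standard library function:

```python
def top_n(mapping, n):
    """Return a new dict with the n entries of mapping that have the largest values."""
    # sort the (key, value) pairs by value, largest first
    ranked = sorted(mapping.items(), key=lambda kv: kv[1], reverse=True)
    # slicing past the end is safe, so n may exceed len(mapping)
    return dict(ranked[:n])

dictionary = {'a': 10, 'b': 20, 'c': 30, 'd': 5}
frequent = top_n(dictionary, 2)
print(frequent)  # {'c': 30, 'b': 20}
```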
Get an iterator over the items with the ordered.iteritems() method (Python 2 only; in Python 3 use ordered.items() instead).
Then, to take the first N items, use the islice function from itertools.
>>> import itertools
>>> toptwo = itertools.islice(ordered.iteritems(), 2)
>>> list(toptwo)
[('c', 30), ('b', 20)]