So, I have this index as a dict.
index = {'Testfil2.txt': ['nisse', 'hue', 'abe', 'pind'], 'Testfil1.txt': ['hue', 'abe',
'tosse', 'svend']}
I need to invert the index so that each word becomes a key and its value is the list of files that contain it. A word that appears in both files should map to both file names, like this:
inverse = {'nisse' : ['Testfil2.txt'], 'hue' : ['Testfil2.txt', 'Testfil1.txt'],
'abe' : ['Testfil2.txt', 'Testfil1.txt'], 'pind' : ['Testfil2.txt'], 'tosse' :
['Testfil1.txt'], 'svend' : ['Testfil1.txt']}
Yes, I typed the above by hand.
My textbook has this function for inverting dictionaries:
def invert_dict(d):
    inverse = dict()
    for key in d:
        val = d[key]
        if val not in inverse:
            inverse[val] = [key]
        else:
            inverse[val].append(key)
    return inverse
It works fine for simple key:value pairs, but when I try the function with a dict that has lists as values, such as my index, I get this error message:
invert_dict(index)
Traceback (most recent call last):
  File "<pyshell#153>", line 1, in <module>
    invert_dict(index)
  File "<pyshell#150>", line 5, in invert_dict
    if val not in inverse:
TypeError: unhashable type: 'list'
I have searched for an hour for a solution and the book is no help. I suspect that I can use tuples in some way, but I am not sure how. Any help?
I tried your code, and the problem is the check val not in inverse. Here val is a list, and you cannot test whether a list is in a dict, because dict keys must be hashable and lists are not.
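To make the error concrete, here is a minimal sketch (the variable names are only for illustration). The membership test has to hash val before it can look it up, so it fails even on an empty dict. Converting the list to a tuple makes it hashable, but then the whole tuple becomes a single key, which is not the result the question asks for.

```python
inverse = {}
val = ['nisse', 'hue', 'abe', 'pind']  # a list value, as in the index

try:
    val in inverse          # hashing the list raises TypeError
    message = None
except TypeError as exc:
    message = str(exc)

# A tuple of strings is hashable, so it can be used as a key,
# but it is one key for the whole tuple, not one key per word.
inverse[tuple(val)] = ['Testfil2.txt']

print(message)
print(tuple(val) in inverse)
```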
For your code a simple change will do what you want:
def invert_dict(d):
    inverse = dict()
    for key in d:
        # Go through the list that is saved in the dict:
        for item in d[key]:
            # Check if the key exists in the inverted dict
            if item not in inverse:
                # If not, create a new list
                inverse[item] = [key]
            else:
                inverse[item].append(key)
    return inverse
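As a quick check, here is the same function run against the index from the question. This is a self-contained sketch, so the function is repeated. The order of the file names inside each list follows the iteration order of the input dict, which is why the check below sorts them.

```python
def invert_dict(d):
    inverse = dict()
    for key in d:
        # Each value is a list, so handle its items one by one
        for item in d[key]:
            if item not in inverse:
                inverse[item] = [key]
            else:
                inverse[item].append(key)
    return inverse


index = {'Testfil2.txt': ['nisse', 'hue', 'abe', 'pind'],
         'Testfil1.txt': ['hue', 'abe', 'tosse', 'svend']}

inverse = invert_dict(index)
print(sorted(inverse['hue']))    # both files contain 'hue'
print(inverse['nisse'])          # only Testfil2.txt contains 'nisse'
```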
My solution for inverting the dictionary. Note that it builds a new dictionary, new_dic:
new_dic = {}
for k, v in index.items():
    for x in v:
        new_dic.setdefault(x, []).append(k)
Output:
{'tosse': ['Testfil1.txt'], 'nisse': ['Testfil2.txt'], 'svend': ['Testfil1.txt'], 'abe': ['Testfil1.txt', 'Testfil2.txt'], 'pind': ['Testfil2.txt'], 'hue': ['Testfil1.txt', 'Testfil2.txt']}
You cannot use list objects as dictionary keys, because keys must be hashable. Instead, you can loop over the items and use the dict.setdefault method to build the expected result:
>>> new = {}
>>>
>>> for k, value in index.items():
...     for v in value:
...         new.setdefault(v, []).append(k)
...
>>> new
{'hue': ['Testfil2.txt', 'Testfil1.txt'], 'svend': ['Testfil1.txt'], 'abe': ['Testfil2.txt', 'Testfil1.txt'], 'tosse': ['Testfil1.txt'], 'pind': ['Testfil2.txt'], 'nisse': ['Testfil2.txt']}
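In case the setdefault call looks opaque, here is a small sketch of what it does for a single word (the variable names first and second are mine). It returns the list already stored under the key, or stores and returns the default when the key is missing. The [] argument is built on every call, even when it is not needed.

```python
d = {}

# Key is missing: the empty list is stored under 'hue' and returned
first = d.setdefault('hue', [])
first.append('Testfil2.txt')

# Key exists: the stored list is returned, the new [] is thrown away
second = d.setdefault('hue', [])
second.append('Testfil1.txt')

print(d)
print(first is second)
```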
If you are dealing with larger datasets, you can avoid creating an empty list on every setdefault() call by using collections.defaultdict, which calls its default factory (here list) only when it encounters a missing key.
from collections import defaultdict

new = defaultdict(list)
for k, value in index.items():
    for v in value:
        new[v].append(k)
>>> new
defaultdict(<type 'list'>, {'hue': ['Testfil2.txt', 'Testfil1.txt'], 'svend': ['Testfil1.txt'], 'abe': ['Testfil2.txt', 'Testfil1.txt'], 'tosse': ['Testfil1.txt'], 'pind': ['Testfil2.txt'], 'nisse': ['Testfil2.txt']})
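If the defaultdict(...) wrapper in the result is unwanted, one option is to convert it back to a plain dict once it is built. A minimal sketch, where invert_index is a hypothetical function name and not something from the question or the textbook:

```python
from collections import defaultdict


def invert_index(index):
    """Map each word to the list of files whose word list contains it."""
    inverse = defaultdict(list)
    for filename, words in index.items():
        for word in words:
            inverse[word].append(filename)
    # A plain dict no longer creates entries when a missing key is looked up
    return dict(inverse)


index = {'Testfil2.txt': ['nisse', 'hue', 'abe', 'pind'],
         'Testfil1.txt': ['hue', 'abe', 'tosse', 'svend']}

result = invert_index(index)
print(result)
```

On Python 3 the defaultdict repr shows <class 'list'> rather than <type 'list'>. The contents are the same.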