Python 3:
I have a list of words. I want to create a dict with the unique words from the list as keys and their frequencies as values. I also want to drop any word that is a substring of another word in the list.
For example:
list = ['goon ', 'goonk ', 'goon ', 'goonj ', 'w ', 'wo ', 'wor ', 'world ', 'world ']
The resulting dictionary should be:
dict = {'goonj': 1, 'world': 2, 'goonk':1}
I have tried the following methods but I do not get the desired dict.
Method 1: If no other word in the list contains the keyword as a substring, I add the keyword to the dict.
for keyword in list:
    if not [key for key in list if keyword in key and key != keyword]:
        if keyword in KeywordDict:
            KeywordDict[keyword] += 1
        else:
            KeywordDict[keyword] = 1
Method 2: Add the word to the dict, then remove all of its substring keys from the dict.
if keyword in KeywordDict:
    KeywordDict[keyword] += 1
else:
    KeywordDict[keyword] = 1
for key in KeywordDict:
    if keyword.startswith(key) > -1:
        KeywordDict.pop(key)
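Two details of Method 2 can be checked in isolation. `str.startswith` returns a bool, so comparing its result with `-1` is always true, and popping keys from a dict while iterating over it raises `RuntimeError` in Python 3. The sketch below uses hypothetical sample data (`sample`, `keyword_dict`) and is only meant to illustrate those two points, plus one possible workaround:

```python
# startswith returns a bool; both True and False compare greater than -1,
# so this condition holds even when the prefix does not match.
always_true = ('world'.startswith('goon') > -1)

# Popping from a dict while iterating over it raises RuntimeError in Python 3.
sample = {'w': 1, 'wo': 1, 'world': 2}
try:
    for key in sample:
        sample.pop(key)
    raised = False
except RuntimeError:
    raised = True

# One workaround: iterate over a snapshot of the keys and remove only
# proper prefixes of the current keyword (hypothetical data).
keyword = 'world'
keyword_dict = {'w': 1, 'wo': 1, 'world': 2}
for key in tuple(keyword_dict):
    if key != keyword and keyword.startswith(key):
        keyword_dict.pop(key)

print(always_true, raised, keyword_dict)
```

Note that `startswith` only tests prefixes. For general substrings, `key in keyword` would be the matching test.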
I have tried other minor variations of the above methods, but the result still contains words that are substrings of other words.
The actual list has about 300 words.
I have also tried using list comprehension and dict comprehension with the same bug.
What am I doing wrong? Can someone suggest an alternative approach?
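One possible alternative, as a rough sketch rather than a definitive answer: count everything first with `collections.Counter`, then keep only the words that are not contained in any other unique word. The function name `count_without_substrings` is made up for illustration, and the sketch assumes the trailing spaces in the sample list are incidental, so it strips them before comparing.

```python
from collections import Counter

def count_without_substrings(words):
    # Assumption: surrounding whitespace is incidental, so strip it first.
    counts = Counter(word.strip() for word in words)
    unique = tuple(counts)
    # Keep a word only if no *other* unique word contains it.
    return {
        word: freq
        for word, freq in counts.items()
        if not any(word != other and word in other for other in unique)
    }

words = ['goon ', 'goonk ', 'goon ', 'goonj ', 'w ', 'wo ', 'wor ', 'world ', 'world ']
result = count_without_substrings(words)
print(result)  # {'goonk': 1, 'goonj': 1, 'world': 2}
```

The comparison is quadratic in the number of unique words, which should be fine for a list of about 300 words.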