Remove items from list that are substrings of other items

Posted 2019-09-02 01:52

Python 3:

I have a list of words. I want to create a dict that has the unique words from the list as keys and their frequencies as values. I also want to drop any word that is a substring of another word in the list.

For example:

list = ['goon ', 'goonk ', 'goon ', 'goonj ', 'w ', 'wo ', 'wor ', 'world ', 'world ']

The dictionary should be:

dict = {'goonj': 1, 'world': 2, 'goonk': 1}

I have tried the following methods but I do not get the desired dict.

Method 1: If no other word in the list contains the keyword, I add the keyword to the dict.

KeywordDict = {}
for keyword in list:
    # Keep the keyword only if no other word in the list contains it.
    if not [key for key in list if keyword in key and key != keyword]:
        if keyword in KeywordDict:
            KeywordDict[keyword] += 1
        else:
            KeywordDict[keyword] = 1
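
For what it is worth, the filtering logic of Method 1 appears to give the desired dict once the words carry no surrounding whitespace. With trailing spaces, as in the sample list, 'goon ' is not a substring of 'goonk ', so nothing gets filtered out. Below is a minimal sketch of that check, assuming stray whitespace is the culprit and is not meaningful; the names `raw`, `words` and `keyword_counts` are illustrative and not part of the original code.

```python
raw = ['goon ', 'goonk ', 'goon ', 'goonj ', 'w ', 'wo ', 'wor ', 'world ', 'world ']

# Assumption: surrounding whitespace is noise, so strip it before comparing.
words = [w.strip() for w in raw]

keyword_counts = {}
for keyword in words:
    # Keep the keyword only if no *other* word contains it.
    if not any(keyword in other and other != keyword for other in words):
        keyword_counts[keyword] = keyword_counts.get(keyword, 0) + 1

print(keyword_counts)  # {'goonk': 1, 'goonj': 1, 'world': 2}
```

If the real data has some other inconsistency (mixed case, punctuation), the same idea applies: normalise first, then run the substring test.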

Method 2: Add the word to the dict and remove all of its substring keys from the dict.

if keyword in KeywordDict:
    KeywordDict[keyword] += 1
else:
    KeywordDict[keyword] = 1
for key in KeywordDict:
    if keyword.startswith(key) > -1:
        KeywordDict.pop(key)
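
Two things in the Method 2 fragment look problematic. `str.startswith` returns a bool, and both `True > -1` and `False > -1` evaluate to true, so the condition holds for every key, including the one just added. Removing keys while iterating over the same dict also raises a RuntimeError in Python 3. The sketch below is one possible repair, not a definitive one: it uses the `in` operator instead of `startswith` (the stated goal is substrings, not only prefixes), iterates over a snapshot of the keys, and skips a keyword already covered by a longer key so that the order of the words does not matter. The function name `count_without_substrings` is hypothetical, and the words are assumed to be free of stray whitespace.

```python
def count_without_substrings(words):
    """Count words, dropping any word that is a proper substring of another."""
    counts = {}
    for keyword in words:
        # Skip a keyword that is already covered by a longer key.
        if any(keyword != key and keyword in key for key in counts):
            continue
        counts[keyword] = counts.get(keyword, 0) + 1
        # Iterate over a snapshot of the keys so that deleting is safe.
        for key in list(counts):
            if key != keyword and key in keyword:
                del counts[key]
    return counts


words = ['goon', 'goonk', 'goon', 'goonj', 'w', 'wo', 'wor', 'world', 'world']
print(count_without_substrings(words))  # {'goonk': 1, 'goonj': 1, 'world': 2}
```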

I have tried other minor variations of the above methods, but the resulting dict still contains words that are substrings of other words.

The actual list has about 300 words.

I have also tried list comprehensions and dict comprehensions, with the same result.

What am I doing wrong? Can someone suggest an alternative approach?
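
One alternative, sketched here under assumptions rather than offered as the definitive answer, is to count first with `collections.Counter` and then filter the unique words. That way each substring test runs once per unique word instead of once per occurrence. The pairwise comparison is quadratic in the number of unique words, which should be fine for about 300 words. The function name `unique_word_counts` is hypothetical, and stripping whitespace assumes it carries no meaning.

```python
from collections import Counter


def unique_word_counts(words):
    """Return {word: frequency} for words that are not substrings of other words."""
    # Assumption: surrounding whitespace is noise and can be stripped.
    counts = Counter(w.strip() for w in words)
    unique = list(counts)
    return {
        word: n
        for word, n in counts.items()
        # Drop the word if any *different* unique word contains it.
        if not any(word != other and word in other for other in unique)
    }


sample = ['goon ', 'goonk ', 'goon ', 'goonj ', 'w ', 'wo ', 'wor ', 'world ', 'world ']
print(unique_word_counts(sample))  # {'goonk': 1, 'goonj': 1, 'world': 2}
```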
