== comparisons against lines from readlines() fail

Posted 2019-03-03 19:43

Question:

I am currently working on a small anagram program that takes all possible permutations of a word and compares them with a dictionary. However, I am unable to get the results to print. The culprit appears to be the == operator: if I use ''.join(words[i]) == compare[j], nothing prints. However, if I input hi and run the program with ''.join(words[i]) == "hi", the entire dictionary prints, but if I invert it to "hi" == compare[j], nothing prints.

Thanks in advance for any help!

import itertools

run = input("Please enter a word: ")
dictionary = "dictionary.txt" #input("Please enter the name of the dictionary file: ")
txt = open(dictionary)

compare = txt.readlines()

words = (list(itertools.permutations(run)))

for i in range(0, len(words)):
    for j in range(0, len(compare)):
        if ''.join(words[i]) == compare[j]:
            print(compare[j])

Answer 1:

compare = txt.readlines()

readlines() doesn't strip the line endings, so each line (except possibly the last one in the file) will have a \n at the end. That causes your comparisons against compare[j] to fail.

You could remove the \n characters with something like:

compare = [line.strip() for line in txt]
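
To make the hidden newline visible, here is a minimal sketch. It uses io.StringIO as a stand-in for the opened dictionary file (an assumption for illustration, with made-up contents); a real file object should behave the same way for these calls.

```python
import io

# io.StringIO stands in for open("dictionary.txt"); the two words are
# made-up sample contents, not the asker's real dictionary.
txt = io.StringIO("hi\nho\n")

raw = txt.readlines()                      # each element keeps its trailing "\n"
txt.seek(0)                                # rewind so the "file" can be read again
stripped = [line.strip() for line in txt]  # the fix suggested in this answer

print(repr(raw[0]))      # repr() shows the newline that print() hides
print("hi" == raw[0])    # False, because "hi" != "hi\n"
print("hi" in stripped)  # True once the newline is removed
```

Printing with repr() is a quick way to spot this kind of problem, since a plain print() renders the trailing newline invisibly.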


Answer 2:

Note that if your word has W letters, and your dictionary has D words, your search is doing W! * D comparisons.
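
To get a feel for how fast W! * D grows, here is a rough back-of-the-envelope sketch. The dictionary size is a made-up figure chosen only for illustration.

```python
import math

word = "tester"             # W = 6 letters
dictionary_size = 100_000   # D: hypothetical size, purely for illustration

permutations = math.factorial(len(word))      # W! orderings of the letters
brute_force = permutations * dictionary_size  # W! * D string comparisons

print(permutations)  # 720
print(brute_force)   # 72000000
```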

You can reduce this to D comparisons by converting both words to canonical form (i.e. letters in alphabetical order) and comparing those instead.
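
A minimal sketch of that idea, assuming the dictionary lines still carry their trailing newlines as readlines() returns them. The names sorted_letters and find_anagrams, and the sample word list, are hypothetical and chosen only for this example.

```python
def sorted_letters(word):
    # Canonical form: strip whitespace, then put the letters in alphabetical order.
    return "".join(sorted(word.strip()))

def find_anagrams(word, dictionary_lines):
    # One comparison per dictionary entry, so D comparisons in total.
    target = sorted_letters(word)
    return [line.strip() for line in dictionary_lines
            if sorted_letters(line) == target]

# Made-up sample standing in for the lines of dictionary.txt.
sample = ["listen\n", "silent\n", "enlist\n", "google\n"]
print(find_anagrams("tinsel", sample))  # ['listen', 'silent', 'enlist']
```

Two words are anagrams exactly when their sorted letters match, so no permutations need to be generated at all.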

If you are going to search for N words you could reduce it further to D / N comparisons per word (amortized) by storing your dictionary as {canonical_form: [list,of,matching,words]}:

from collections import defaultdict

DICT_FILE = "dictionary.txt"

def canonize(word):
    # "hello\n" => "ehllo"
    return "".join(sorted(word.strip()))

def load_dict(fname=DICT_FILE):
    lookup = defaultdict(list)
    with open(fname) as inf:
        for line in inf:
            word = line.strip()
            canon = canonize(word)
            lookup[canon].append(word)
    # lookup["ehllo"] = ["hello"]
    return lookup

def main():
    anagrams = load_dict()
    while True:
        word = input("Enter word to search for (or hit Enter to quit): ").strip()
        if not word:
            break
        else:
            canon = canonize(word)
            if canon in anagrams:
                print("Found: " + ", ".join(anagrams[canon]))
            else:
                print("No anagrams found.")

if __name__ == "__main__":
    main()

which then runs like

Enter word to search for (or hit Enter to quit): tester
Found: retest, setter, street, tester

Enter word to search for (or hit Enter to quit): binary
Found: binary, brainy

Enter word to search for (or hit Enter to quit): ttt
No anagrams found.

Enter word to search for (or hit Enter to quit): 


Answer 3:

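Remove the newlines from each line. Note that compare is a list (the result of readlines()), so replace() has to be applied to each string rather than to the list itself:

compare = [line.replace('\n', '') for line in compare]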
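
Another option, sketched here under the assumption that the whole dictionary fits comfortably in memory, is to read the file in one go and let str.splitlines() drop the line endings. As before, io.StringIO and its contents are stand-ins for the real file.

```python
import io

# io.StringIO stands in for open("dictionary.txt"); contents are made up.
txt = io.StringIO("binary\nbrainy\nhi")

# read() returns one string; splitlines() splits it and removes the line endings.
compare = txt.read().splitlines()

print(compare)          # ['binary', 'brainy', 'hi']
print("hi" in compare)  # True
```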