I'm trying to write a method that checks whether a given phrase matches at least one item from a list of phrases and returns the matching items. The inputs are the phrase, a list of phrases and a dictionary of lists of synonyms. The point is to make it universal.
Here is the example:
phrase = 'This is a little house'
dictSyns = {'little':['small','tiny','little'],
'house':['cottage','house']}
listPhrases = ['This is a tiny house','This is a small cottage','This is a small building','I need advice']
I can write code that does this for the example and returns a bool:
if any('This' + ' ' + 'is' + ' ' + 'a' + ' ' + x + ' ' + y == phrase for x in dictSyns['little'] for y in dictSyns['house']):
    print 'match'
First, the function has to be universal, working for any phrase and synonym dictionary rather than just this example. Second, I want the function to return the list of matched phrases.
Can you give me advice on how to do that, so that the method returns ['This is a tiny house','This is a small cottage'] in this case?
The output would look like:
>>> getMatches(phrase, dictSyns, listPhrases)
['This is a tiny house','This is a small cottage']
I went about it this way:
Probably not hugely efficient. It creates a list of acceptable words. It then compares each word in each string to that list and if there are no unacceptable words it prints the phrase.
EDIT: I've also realised this doesn't check for grammatical sense. For example the phrase 'little little this a' would still return as correct. It's simply checking for each word. I'll leave this here to display my shame.
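A minimal sketch of the word-checking approach described above, not the original code. The function name get_matches_by_words is hypothetical, and it assumes that the acceptable words are the words of the phrase plus the synonyms listed for them:

```python
def get_matches_by_words(phrase, dict_syns, list_phrases):
    # Acceptable words: every word of the phrase plus its listed synonyms
    # (assumption: only synonyms of words that occur in the phrase count).
    acceptable = set(phrase.split())
    for word in phrase.split():
        acceptable.update(dict_syns.get(word, []))
    # Keep each candidate that contains no unacceptable word.
    # Word order and repetition are NOT checked.
    return [p for p in list_phrases
            if all(w in acceptable for w in p.split())]


phrase = 'This is a little house'
dictSyns = {'little': ['small', 'tiny', 'little'],
            'house': ['cottage', 'house']}
listPhrases = ['This is a tiny house', 'This is a small cottage',
               'This is a small building', 'I need advice']

result = get_matches_by_words(phrase, dictSyns, listPhrases)
print(result)
```

Because only individual words are checked, a scrambled candidate such as 'house little a This' would also be accepted, which is the weakness noted in the EDIT.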
I would approach this as follows:
The root of the code is the assignment of words, in new_phrases, which transforms the phrase and syns into a more usable form, a list where each element is a list of the acceptable choices for that word:
Note the following:
set for efficient (O(1), vs. O(n) for a list) membership testing;
itertools.product to generate the possible combinations of phrase based on the syns (you could also use itertools.ifilter in implementing this); and
In use:
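A minimal sketch of the approach described above, reconstructed from the description rather than taken from the original answer. The names words, new_phrases, phrase and syns follow the text; the function name get_matches and the fallback for words without synonyms are assumptions:

```python
from itertools import product


def get_matches(phrase, syns, phrases):
    # One list of acceptable choices per word of the phrase. A word with no
    # entry in syns can only be itself (assumed fallback).
    words = [syns.get(word, [word]) for word in phrase.split()]
    # Every combination of choices, joined back into a phrase. A set gives
    # O(1) membership tests, versus O(n) for a list.
    new_phrases = set(' '.join(choice) for choice in product(*words))
    # Keep the candidates, in their original order, that are valid variants.
    return [p for p in phrases if p in new_phrases]


phrase = 'This is a little house'
dictSyns = {'little': ['small', 'tiny', 'little'],
            'house': ['cottage', 'house']}
listPhrases = ['This is a tiny house', 'This is a small cottage',
               'This is a small building', 'I need advice']

matches = get_matches(phrase, dictSyns, listPhrases)
print(matches)  # ['This is a tiny house', 'This is a small cottage']
```

Note that a word only matches itself if it appears in its own synonym list, as 'little' and 'house' do in the example dictionary. The number of generated combinations is the product of the synonym-list lengths, so very long phrases with many synonyms could make new_phrases large.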
Things to think about:
"House of Commons"
be treated)?