Issues with replacing words in a string using a di

Posted 2019-08-03 09:26

Question:

Say I have a dictionary, a string and a list of the words in that string. Like this:

the_dictionary={'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'}

the_string='I thought that was yours'

list_string=['I','thought','that','was','yours']

This is my code:

for word in list_string:
    if word in the_dictionary:
        the_string = the_string.replace(word, the_dictionary[word], 1)
print(the_string)

Input: I thought that was yours

Output: you thought that was mine

Here everything works great, but if I change the input to:

the_string="That is mine that is yours"

Input: That is mine that is yours

Output: That is mine that is yours

Nothing changes.

Obviously it has something to do with the fact that they are a key-value pair but my hope is that this can be solved somehow.

My question: Why does this happen and can it be fixed?

Please keep in mind that I am still sort of a beginner and would appreciate it if you could pretend I am a child while explaining it.

Thanks for taking the time /wazus

Answer 1:

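The issue is that you call replace on the_string every time, and when it is given the optional third argument (count), replace only replaces the first count occurrences of the search string. With a count of 1, that is only the first occurrence, wherever it is in the string.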

So, the first time you encounter mine in list_string, the_string gets changed to That is yours that is yours. So far, this is what is expected.

But later, you encounter yours in list_string, and you say the_string = the_string.replace('yours', 'mine', 1). So, the first occurrence of yours in the_string gets replaced with mine, which brings us back to the original string.
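To see the swap-back happen, here is a small sketch that runs the loop from the question and records the string after each replacement. The steps list is only there for illustration and is not part of the original code.

```python
the_dictionary = {'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'}
the_string = "That is mine that is yours"

steps = []  # the string after each replacement, kept only for inspection
for word in the_string.split():
    if word in the_dictionary:
        # Replaces the FIRST match anywhere in the string,
        # not the word at the current position in the list
        the_string = the_string.replace(word, the_dictionary[word], 1)
        steps.append(the_string)
        print(repr(word), '->', the_string)
```

The first line printed is the string with two "yours" in it, and the second is the original string again.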

Here's one way to fix it:

In [78]: the_string="That is mine that is yours"

In [79]: the_dictionary={'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'}

In [80]: list_string = the_string.split()

In [81]: for i, word in enumerate(list_string):
   ....:     if word in the_dictionary:
   ....:         list_string[i] = the_dictionary[word]
   ....:

In [82]: print(' '.join(list_string))
That is yours that is mine
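The same idea can be written more compactly. This is just a sketch using dict.get, which returns the word itself when it is not a key, so every word is looked up exactly once. The name swapped is made up for the example.

```python
the_dictionary = {'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'}
the_string = "That is mine that is yours"

# Look each word up once; words that are not keys are kept unchanged
swapped = ' '.join(the_dictionary.get(word, word) for word in the_string.split())
print(swapped)  # That is yours that is mine
```

Note that split() followed by join() turns any run of whitespace into a single space, which is fine for a sentence like this one.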


Answer 2:

Here's what's happening in your second example. Originally, you have:

the_string = "That is mine that is yours"

Your script changes the first "mine" into "yours", which gives:

the_string = "That is yours that is yours"

Then, when it reaches "yours" in the list, it changes the first "yours" in the string (the one that was just changed!) back to "mine", giving you the original phrase again:

the_string = "That is mine that is yours"

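Well, then: why didn't the same thing happen with the first string? Because of which words it contains. The loop goes through list_string from left to right, so the order is always the same. In "I thought that was yours", "I" becomes "you" and "yours" becomes "mine", and neither "you" nor "mine" comes up later in the list, so nothing gets changed back. The second string contains both "mine" and "yours", which swap with each other, so the second replacement undoes the first.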

First, you want to make sure that once a word has been changed, it doesn't get changed back. Given the structure of your original script, it's better to change the list than the string. You enumerate the items in the list, and if an item is one of the dictionary's keys, you replace it with the matching value. Then you turn the list back into a string:

the_dictionary = {'I': 'you', 'mine': 'yours', 'yours': 'mine', 'you': 'I'}

the_string1 = 'I thought that was yours'
the_string2 = 'That is mine that is yours'


list_string1 = ['I', 'thought', 'that', 'was', 'yours']
list_string2 = ['That', 'is', 'mine', 'that', 'is', 'yours']


# Replace each word at its own position, so a swapped word is never looked at again
for i, word in enumerate(list_string1):
    if word in the_dictionary.keys():
        list_string1[i] = the_dictionary[word]
# Build "%s %s ... " with one slot per word, then fill it in (leaves a trailing space)
the_string1 = "%s " * len(list_string1) % tuple(list_string1)

for i, word in enumerate(list_string2):
    if word in the_dictionary.keys():
        list_string2[i] = the_dictionary[word]
the_string2 = "%s " * len(list_string2) % tuple(list_string2)

print(the_string1)
print(the_string2)

I used enumerate(), which makes it easy to get both the index and the item of a list. Then I used a little trick to turn the list back into a string. It's not the best way: it leaves a trailing space, and ' '.join(list_string1) is simpler. Of course, it would be better to wrap all of that up in a function. You can even turn the string into a list with the regular expression module:

import re
the_string_list = re.findall(r'\w+',the_string)
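As a rough sketch of how the function and the regular expression could be combined: re.sub can take a function as its replacement, so each word is looked up once and the punctuation and spacing between words are left alone. The names swap_words and lookup are made up for this example.

```python
import re

def swap_words(text, mapping):
    """Replace every whole word of text that is a key of mapping, in one pass."""
    def lookup(match):
        word = match.group(0)
        # Keep the word unchanged if it is not in the mapping
        return mapping.get(word, word)
    # \w+ matches one run of letters, digits or underscores at a time
    return re.sub(r'\w+', lookup, text)

the_dictionary = {'I': 'you', 'mine': 'yours', 'yours': 'mine', 'you': 'I'}
print(swap_words('That is mine, that is yours', the_dictionary))
# That is yours, that is mine
```

Because the text is scanned only once, a word that has already been swapped is never swapped back.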

Hope it helps!