replace item in a string if it matches an item in

Posted 2019-03-21 17:48

Question:

I am trying to remove words from a string if they match a list.

x = "How I Met Your Mother 7x17 (HDTV-LOL) [VTV] - Mon, 20 Feb 2012"

tags = ['HDTV', 'LOL', 'VTV', 'x264', 'DIMENSION', 'XviD', '720P', 'IMMERSE']

print x

for tag in tags:
    if tag in x:
        print x.replace(tag, '')

It produces this output:

How I Met Your Mother 7x17 (HDTV-LOL) [VTV] - Mon, 20 Feb 2012
How I Met Your Mother 7x17 (-LOL) [VTV] - Mon, 20 Feb 2012
How I Met Your Mother 7x17 (HDTV-) [VTV] - Mon, 20 Feb 2012
How I Met Your Mother 7x17 (HDTV-LOL) [] - Mon, 20 Feb 2012

I want it to remove all the words matching the list.

Answer 1:

You are not keeping the result of x.replace(). Try the following instead:

for tag in tags:
    x = x.replace(tag, '')
print x

Note that your approach matches any substring, and not just full words. For example, it would remove the LOL in RUN LOLA RUN.

One way to address this would be to enclose each tag in a pair of r'\b' strings, and look for the resulting regular expression. The r'\b' would only match at word boundaries:

import re

for tag in tags:
    # re.escape keeps any regex metacharacters in a tag literal
    x = re.sub(r'\b' + re.escape(tag) + r'\b', '', x)
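The word-boundary idea can also be sketched as a single precompiled pattern instead of one re.sub call per tag. This is only a sketch in Python 3 syntax: the helper name strip_tags is hypothetical, and it assumes the tags should be treated as literal text (hence re.escape).

```python
import re

tags = ['HDTV', 'LOL', 'VTV', 'x264', 'DIMENSION', 'XviD', '720P', 'IMMERSE']

# Join all tags into one alternation wrapped in word boundaries.
# re.escape keeps any regex metacharacters inside a tag literal.
pattern = re.compile(r'\b(?:' + '|'.join(re.escape(tag) for tag in tags) + r')\b')


def strip_tags(text):
    """Hypothetical helper: remove every whole-word occurrence of a tag."""
    return pattern.sub('', text)


print(strip_tags("How I Met Your Mother 7x17 (HDTV-LOL) [VTV] - Mon, 20 Feb 2012"))
print(strip_tags("RUN LOLA RUN"))  # LOL inside LOLA is not a whole word, so it stays
```

With the pattern compiled once, the string is scanned a single time regardless of how many tags there are.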


Answer 2:

The method str.replace() does not change the string in place -- strings are immutable in Python. You have to bind x to the new string returned by replace() in each iteration:

for tag in tags:
    x = x.replace(tag, "")

Note that the if statement is redundant; str.replace() won't do anything if it doesn't find a match.
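Both points are easy to check in an interpreter. The snippet below is a small illustration in Python 3 syntax, using a shortened version of the string from the question; the names y and z are just for the demonstration.

```python
x = "How I Met Your Mother 7x17 (HDTV-LOL) [VTV]"

# replace() returns a new string; x itself is left untouched.
y = x.replace("HDTV", "")
print(x)
print(y)

# When the substring is absent, replace() returns an equal string,
# so guarding the call with `if tag in x` adds nothing.
z = x.replace("x264", "")
print(z == x)
```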



Answer 3:

Using your variables tags and x, you can use this:

output = reduce(lambda a,b: a.replace(b, ''), tags, x)

returns:

'How I Met Your Mother 7x17 (-) [] - Mon, 20 Feb 2012'
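The one-liner above relies on reduce being a builtin, which holds for Python 2. Assuming you are on Python 3, the same fold would need an import from functools; a self-contained sketch:

```python
from functools import reduce  # on Python 3, reduce is no longer a builtin

x = "How I Met Your Mother 7x17 (HDTV-LOL) [VTV] - Mon, 20 Feb 2012"
tags = ['HDTV', 'LOL', 'VTV', 'x264', 'DIMENSION', 'XviD', '720P', 'IMMERSE']

# Fold over the tags: each step feeds the partially cleaned string
# into the next replace() call, starting from x.
output = reduce(lambda text, tag: text.replace(tag, ''), tags, x)
print(output)
```

This is equivalent to the plain for loop that rebinds x on each iteration; which one reads better is mostly a matter of taste.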


Answer 4:

(1) x.replace(tag, '') does not modify x, but rather returns a new string with the replacement.

(2) Why are you printing on each iteration?

The simplest modification you could do would be:

for tag in tags:
    x = x.replace(tag, '')