Replacing substrings given a dictionary of strings

Posted 2020-03-31 07:45

Question:

I have a dictionary whose keys are the strings to be replaced and whose values are their replacements. Other than going through the sentence token by token, is there a better or faster way of doing the replacement?

This is what I have been doing:

segmenter = {'foobar':'foo bar', 'withoutspace':'without space', 'barbar': 'bar bar'}

sentence = "this is a foobar in a barbar withoutspace"

for i in sentence.split():
  if i in segmenter:
    sentence.replace(i, segmenter[i])

Answer 1:

Strings are immutable in Python, so str.replace returns a new string instead of modifying the original; the loop in the question discards that result. You can use str.join() with a list comprehension instead:

>>> segmenter = {'foobar':'foo bar', 'withoutspace':'without space', 'barbar': 'bar bar'}
>>> sentence = "this is a foobar in a barbar withoutspace"

>>> " ".join([segmenter.get(word, word) for word in sentence.split()])
'this is a foo bar in a bar bar without space'

Another problem with str.replace is that it matches substrings, so it will also turn a word like "abarbarb" into "abar barb".
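
Both points can be checked with a small sketch. It reuses the segmenter dictionary from the question; the variable names (unchanged, reassigned, substring_hit, token_level) and the short test sentences are made up here for illustration only.

```python
segmenter = {'foobar': 'foo bar', 'withoutspace': 'without space', 'barbar': 'bar bar'}

sentence = "this is a foobar"

# str.replace returns a new string; without an assignment the result is lost
sentence.replace('foobar', segmenter['foobar'])
unchanged = sentence

# Assigning the return value keeps the replacement
reassigned = sentence.replace('foobar', segmenter['foobar'])

# str.replace matches substrings, so a longer word containing a key is altered
substring_hit = "abarbarb".replace('barbar', segmenter['barbar'])

# The token-level lookup leaves "abarbarb" alone because it is not a key
token_level = " ".join(segmenter.get(w, w) for w in "an abarbarb barbar".split())

print(unchanged)      # this is a foobar
print(reassigned)     # this is a foo bar
print(substring_hit)  # abar barb
print(token_level)    # an abarbarb bar bar
```

Note that the split/join approach only matches whole whitespace-separated tokens, so a key followed directly by punctuation (for example "foobar,") would not be replaced.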



Answer 2:

re.sub can take a function as its replacement argument; the function receives each match object and returns the substitution text:

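import re

segmenter = {'foobar':'foo bar', 'withoutspace':'without space', 'barbar': 'bar bar'}
sentence = "this is a foobar in a barbar withoutspace"

def fn(match):
    # Look up the replacement for whichever key matched
    return segmenter[match.group()]

# Join the escaped keys into one alternation pattern and substitute in a single pass
print(re.sub('|'.join(re.escape(k) for k in segmenter), fn, sentence))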
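
The pattern above matches keys anywhere, including inside longer words, so it has the same "abarbarb" issue as str.replace. One possible variation, not part of the original answer, is to wrap the alternation in \b word boundaries and try longer keys first. This is only a sketch: it assumes every key starts and ends with a word character, and the helper name replace_whole_words is hypothetical.

```python
import re

segmenter = {'foobar': 'foo bar', 'withoutspace': 'without space', 'barbar': 'bar bar'}

# Longest keys first, so a longer key wins over a shorter key that is its prefix
keys = sorted(segmenter, key=len, reverse=True)

# \b on both sides restricts matches to whole words
# (assumption: keys begin and end with word characters)
pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b')

def replace_whole_words(text):
    # Hypothetical helper: substitute every whole-word key in one pass
    return pattern.sub(lambda m: segmenter[m.group()], text)

result = replace_whole_words("this is a foobar in a barbar withoutspace, not abarbarb")
print(result)  # this is a foo bar in a bar bar without space, not abarbarb
```

Unlike the split/join approach, this keeps the original whitespace and still replaces a key that is followed by punctuation.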