Python replace function [replace once]

I need help with a program I'm making in Python.

Assume I wanted to replace every instance of the word "steak" to "ghost" (just go with it...) but I also wanted to replace every instance of the word "ghost" to "steak" at the same time. The following code does not work:

 s="The scary ghost ordered an expensive steak"
 print s
 s=s.replace("steak","ghost")
 s=s.replace("ghost","steak")
 print s

it prints: The scary steak ordered an expensive steak

What I'm trying to get is The scary steak ordered an expensive ghost

标签： python string replace

6条回答

甜甜的少女心

2楼-- · 2019-01-09 01:24

Rename one of the words to a temp value that doesn't occur in the text. Note this wouldn't be the most efficient way for a very large text. For that a re.sub might be more appropriate.

 s="The scary ghost ordered an expensive steak"
 print s
 s=s.replace("steak","temp")
 s=s.replace("ghost","steak")
 S=s.replace("temp","steak")
 print s

0人赞添加讨论(0) 举报

放我归山

3楼-- · 2019-01-09 01:37

I'd probably use a regex here:

>>> import re
>>> s = "The scary ghost ordered an expensive steak"
>>> sub_dict = {'ghost':'steak','steak':'ghost'}
>>> regex = '|'.join(sub_dict)
>>> re.sub(regex, lambda m: sub_dict[m.group()], s)
'The scary steak ordered an expensive ghost'

Or, as a function which you can copy/paste:

import re
def word_replace(replace_dict,s):
    regex = '|'.join(replace_dict)
    return re.sub(regex, lambda m: replace_dict[m.group()], s)

Basically, I create a mapping of words that I want to replace with other words (sub_dict). I can create a regular expression from that mapping. In this case, the regular expression is "steak|ghost" (or "ghost|steak" -- order doesn't matter) and the regex engine does the rest of the work of finding non-overlapping sequences and replacing them accordingly.

Some possibly useful modifications

regex = '|'.join(map(re.escape,replace_dict)) -- Allows the regular expressions to have special regular expression syntax in them (like parenthesis). This escapes the special characters to make the regular expressions match the literal text.
regex = '|'.join(r'\b{0}\b'.format(x) for x in replace_dict) -- make sure that we don't match if one of our words is a substring in another word. In other words, change he to she but not the to tshe.

0人赞添加讨论(0) 举报

你好瞎i

4楼-- · 2019-01-09 01:42

Note Considering the viewership of this Question, I undeleted and rewrote it for different types of test cases

I have considered four competing implementations from the answers

>>> def sub_noregex(hay):
    """
    The Join and replace routine which outpeforms the regex implementation. This
    version uses generator expression
    """
    return 'steak'.join(e.replace('steak','ghost') for e in hay.split('ghost'))

>>> def sub_regex(hay):
    """
    This is a straight forward regex implementation as suggested by @mgilson
    Note, so that the overheads doesn't add to the cummulative sum, I have placed
    the regex creation routine outside the function
    """
    return re.sub(regex,lambda m:sub_dict[m.group()],hay)

>>> def sub_temp(hay, _uuid = str(uuid4())):
    """
    Similar to Mark Tolonen's implementation but rather used uuid for the temporary string
    value to reduce collission
    """
    hay = hay.replace("steak",_uuid).replace("ghost","steak").replace(_uuid,"steak")
    return hay

>>> def sub_noregex_LC(hay):
    """
    The Join and replace routine which outpeforms the regex implementation. This
    version uses List Comprehension
    """
    return 'steak'.join([e.replace('steak','ghost') for e in hay.split('ghost')])

A generalized timeit function

>>> def compare(n, hay):
    foo = {"sub_regex": "re",
           "sub_noregex":"",
           "sub_noregex_LC":"",
           "sub_temp":"",
           }
    stmt = "{}(hay)"
    setup = "from __main__ import hay,"
    for k, v in foo.items():
        t = Timer(stmt = stmt.format(k), setup = setup+ ','.join([k, v] if v else [k]))
        yield t.timeit(n)

And the generalized test routine

>>> def test(*args, **kwargs):
    n = kwargs['repeat']
    print "{:50}{:^15}{:^15}{:^15}{:^15}".format("Test Case", "sub_temp",
                             "sub_noregex ", "sub_regex",
                             "sub_noregex_LC ")
    for hay in args:
        hay, hay_str = hay
        print "{:50}{:15.10}{:15.10}{:15.10}{:15.10}".format(hay_str, *compare(n, hay))

And the Test Results are as follows

>>> test((' '.join(['steak', 'ghost']*1000), "Multiple repeatation of search key"),
         ('garbage '*998 + 'steak ghost', "Single repeatation of search key at the end"),
         ('steak ' + 'garbage '*998 + 'ghost', "Single repeatation of at either end"),
         ("The scary ghost ordered an expensive steak", "Single repeatation for smaller string"),
         repeat = 100000)
Test Case                                            sub_temp     sub_noregex      sub_regex   sub_noregex_LC 
Multiple repeatation of search key                   0.2022748797   0.3517142003   0.4518992298   0.1812594258
Single repeatation of search key at the end          0.2026047957   0.3508259952   0.4399926194   0.1915298898
Single repeatation of at either end                  0.1877455356   0.3561734007   0.4228843986   0.2164233388
Single repeatation for smaller string                0.2061019057   0.3145984487   0.4252060592   0.1989413449
>>>

Based on the Test Result

Non Regex LC and the temp variable substitution have better performance though the performance of the usage of temp variable is not consistent
LC version has better performance compared to generator (confirmed)
Regex is more than two times slower (so if the piece of code is a bottleneck then the implementation change can be reconsidered)
The Regex and non regex versions are equivalently Robust and can scale

0人赞添加讨论(0) 举报

beautiful°

5楼-- · 2019-01-09 01:43

How about something like this? Store the original in a split list, then have a translation dict. Keeps your core code short, then just adjust the dict when you need to adjust the translation. Plus, easy to port to a function:

 def translate_line(s, translation_dict):
    line = []
    for i in s.split():
       # To take account for punctuation, strip all non-alnum from the
       # word before looking up the translation.
       i = ''.join(ch for ch in i if ch.isalnum()]
       line.append(translation_dict.get(i, i))
    return ' '.join(line)


 >>> translate_line("The scary ghost ordered an expensive steak", {'steak': 'ghost', 'ghost': 'steak'})
 'The scary steak ordered an expensive ghost'

0人赞添加讨论(0) 举报

聊天终结者

6楼-- · 2019-01-09 01:45

Split the string by one of the targets, do the replace, and put the whole thing back together.

pieces = s.split('steak')
s = 'ghost'.join(piece.replace('ghost', 'steak') for piece in pieces)

This works exactly as .replace() would, including ignoring word boundaries. So it will turn "steak ghosts" into "ghost steaks".

0人赞添加讨论(0) 举报

做自己的国王

7楼-- · 2019-01-09 01:46

Use the count variable in the string.replace() method. So using your code, you wouold have:

s="The scary ghost ordered an expensive steak"
print s
s=s.replace("steak","ghost", 1)
s=s.replace("ghost","steak", 1)
print s

http://docs.python.org/2/library/stdtypes.html

0人赞添加讨论(0) 举报

Python replace function [replace once]

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间