How do I coalesce a sequence of identical characte

Posted 2020-03-02 03:09

Suppose I have this:

My---sun--is------very-big---.

I want to replace all multiple hyphens with just one hyphen.

10 answers
虎瘦雄心在
#2 · 2020-03-02 03:25

If you really only want to coalesce hyphens, use the other suggestions. Otherwise you can write your own function, something like this:

>>> def coalesce(x):
...     n = []
...     for c in x:
...         if not n or c != n[-1]:
...             n.append(c)
...     return ''.join(n)
...
>>> coalesce('My---sun--is------very-big---.')
'My-sun-is-very-big-.'
>>> coalesce('aaabbbccc')
'abc'
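
A variation on the loop above, as a rough sketch: it collapses runs only for a chosen set of characters and leaves every other run alone. The name `coalesce_only` and the `chars` parameter are made up for this illustration and are not part of the answer.

```python
def coalesce_only(text, chars):
    """Collapse runs of any character in `chars`; leave other runs untouched."""
    out = []
    for c in text:
        # Skip c only when it repeats the previous output character
        # and is one of the characters we were asked to collapse.
        if out and c == out[-1] and c in chars:
            continue
        out.append(c)
    return ''.join(out)

print(coalesce_only('My---sun--is------very-big---.', '-'))  # My-sun-is-very-big-.
print(coalesce_only('aaabbb---ccc', '-'))                    # aaabbb-ccc
```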
太酷不给撩
#3 · 2020-03-02 03:26

How about:

>>> import re
>>> re.sub("-+", "-", "My---sun--is------very-big---.")
'My-sun-is-very-big-.'

The regular expression "-+" matches one or more consecutive "-" characters, and re.sub replaces each such run with a single "-".
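
If the character to collapse is not always a hyphen, the same idea could be wrapped in a small helper. This is only a sketch: `collapse_runs` is a hypothetical name, and `re.escape` is used so that characters such as "+" or "." are treated literally rather than as regex syntax.

```python
import re

def collapse_runs(text, ch):
    """Replace each run of `ch` in `text` with a single `ch`."""
    # The non-capturing group makes '+' apply to the whole of `ch`,
    # and re.escape stops regex metacharacters from being interpreted.
    pattern = '(?:' + re.escape(ch) + ')+'
    # A function replacement returns `ch` verbatim, so a backslash in
    # `ch` is not read as an escape in the replacement string.
    return re.sub(pattern, lambda m: ch, text)

print(collapse_runs('My---sun--is------very-big---.', '-'))  # My-sun-is-very-big-.
print(collapse_runs('a+++b++c', '+'))                        # a+b+c
```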

欢心
#4 · 2020-03-02 03:32
import re
re.sub('-+', '-', "My---sun--is------very-big---")
放荡不羁爱自由
#5 · 2020-03-02 03:36

As usual, there's a nice itertools solution, using groupby:

>>> from itertools import groupby
>>> s = 'aaaaa----bbb-----cccc----d-d-d'
>>> ''.join(key for key, group in groupby(s))
'a-b-c-d-d-d'
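
groupby yields one (key, group) pair per run of equal characters, so joining only the keys collapses every run. To collapse just the hyphens, the group can be re-joined for other characters. A minimal sketch, with `coalesce_hyphens` as a hypothetical helper name:

```python
from itertools import groupby

def coalesce_hyphens(s):
    # For a run of hyphens keep a single key; for any other run,
    # re-join the whole group so it is left unchanged.
    return ''.join(
        key if key == '-' else ''.join(group)
        for key, group in groupby(s)
    )

print(coalesce_hyphens('aaaaa----bbb-----cccc----d-d-d'))  # aaaaa-bbb-cccc-d-d-d
```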
家丑人穷心不美
#6 · 2020-03-02 03:39

If you want to replace any run of consecutive characters, you can use

>>> import re
>>> a = "AA---BC++++DDDD-EE$$$$FF"
>>> print(re.sub(r"(.)\1+",r"\1",a))
A-BC+D-E$F

If you only want to coalesce non-word-characters, use

>>> print(re.sub(r"(\W)\1+",r"\1",a))
AA-BC+DDDD-EE$FF

If it's really just hyphens, I recommend unutbu's solution.
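
One caveat with the (.)\1+ pattern above: by default "." does not match a newline, so runs of blank lines are left alone unless re.DOTALL is passed. A short sketch of the difference (the sample string and the variable names are made up for illustration):

```python
import re

text = "a\n\n\nb  c"

# Default: '.' skips newlines, so only the doubled space is collapsed.
default = re.sub(r"(.)\1+", r"\1", text)

# With re.DOTALL, '.' also matches '\n', so the newline run collapses too.
dotall = re.sub(r"(.)\1+", r"\1", text, flags=re.DOTALL)

print(repr(default))  # 'a\n\n\nb c'
print(repr(dotall))   # 'a\nb c'
```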

\"骚年 ilove
#7 · 2020-03-02 03:40

Another simple solution is the str object's replace method, applied repeatedly until no doubled hyphen remains.

while '--' in astr:
    astr = astr.replace('--','-')
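
Wrapped in a function, the loop above might look like the following sketch (`squeeze_hyphens` is a hypothetical name). Each pass replaces non-overlapping '--' pairs, so a long run shrinks by about half per pass and several passes may be needed.

```python
def squeeze_hyphens(astr):
    # Keep replacing doubled hyphens until none are left; a single
    # replace() call is not enough for runs of three or more.
    while '--' in astr:
        astr = astr.replace('--', '-')
    return astr

print(squeeze_hyphens('My---sun--is------very-big---.'))  # My-sun-is-very-big-.
```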