Fast way to remove lowercase substrings from string

Posted 2020-03-10 05:21

What's an efficient way in Python (plain or using numpy) to remove all lowercase substrings from a string s?

s = "FOObarFOOObBAR"
remove_lower(s) => "FOOFOOOBAR"

3 answers
来,给爷笑一个
#2 · 2020-03-10 05:38
import re

remove_lower = lambda text: re.sub('[a-z]', '', text)

s = "FOObarFOOObBAR"
s = remove_lower(s)

print(s)
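
A possible variant of the regex above, sketched under the assumption that the input contains only ASCII letters: compiling the pattern once and matching whole runs with `[a-z]+` lets each lowercase substring be removed in a single substitution. The names `_LOWER_RUN` and `remove_lower_runs` are illustrative and not part of the original answer.

```python
import re

# Compile once so repeated calls do not go through re's pattern lookup.
# The "+" makes each match cover a whole run of lowercase ASCII letters.
_LOWER_RUN = re.compile(r"[a-z]+")


def remove_lower_runs(text):
    """Return text with every run of ASCII lowercase letters removed."""
    return _LOWER_RUN.sub("", text)


s = "FOObarFOOObBAR"
print(remove_lower_runs(s))  # FOOFOOOBAR
```

Whether this is actually faster than the character-by-character pattern depends on the input, so it is worth timing on representative data.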
叛逆
#3 · 2020-03-10 05:49

My first approach would be ''.join(x for x in s if not x.islower())
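
One thing to keep in mind with this approach: `str.islower()` is Unicode-aware, while a pattern such as `[a-z]` or `string.ascii_lowercase` only covers ASCII. A small sketch of the difference, with illustrative function names that are not from the thread:

```python
import re


def remove_lower_unicode(text):
    # islower() is true for any lowercase cased character, including "ß".
    return "".join(ch for ch in text if not ch.islower())


def remove_lower_ascii(text):
    # [a-z] only matches the 26 ASCII lowercase letters.
    return re.sub("[a-z]", "", text)


sample = "GROSSgroß"
print(remove_lower_unicode(sample))  # GROSS
print(remove_lower_ascii(sample))    # GROSSß
```

For pure ASCII input such as the example in the question, both give the same result.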

If you need speed, use mgilson's answer (the str.translate approach); it is a lot faster.

>>> timeit.timeit("''.join(x for x in 'FOOBarBaz' if not x.islower())")
3.318969964981079

>>> timeit.timeit("'FOOBarBaz'.translate(None, string.ascii_lowercase)", "import string")
0.5369198322296143

>>> timeit.timeit("re.sub('[a-z]', '', 'FOOBarBaz')", "import re")
3.631659984588623

>>> timeit.timeit("r.sub('', 'FOOBarBaz')", "import re; r = re.compile('[a-z]')")
1.9642360210418701

>>> timeit.timeit("''.join(x for x in 'FOOBarBaz' if x not in lowercase)", "lowercase = set('abcdefghijklmnopqrstuvwxyz')")
2.9605889320373535
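
The timings above use the Python 2 form of `str.translate`. To repeat the comparison on Python 3, something like the following sketch could be used. The names `CANDIDATES` and `run_benchmarks` are made up for illustration, and no timings are quoted because they vary by machine and interpreter version.

```python
import re
import string
import timeit

SAMPLE = "FOOBarBaz"

# Prepared once, outside the timed calls.
_LOWER_RE = re.compile("[a-z]")
_DELETE_LOWER = str.maketrans("", "", string.ascii_lowercase)

# Each candidate takes a string and returns it without ASCII lowercase letters.
CANDIDATES = {
    "join + islower": lambda s: "".join(c for c in s if not c.islower()),
    "str.translate": lambda s: s.translate(_DELETE_LOWER),
    "re.sub": lambda s: re.sub("[a-z]", "", s),
    "compiled regex": lambda s: _LOWER_RE.sub("", s),
}


def run_benchmarks(number=100_000):
    """Time each candidate on SAMPLE and return {name: seconds}."""
    results = {}
    for name, func in CANDIDATES.items():
        results[name] = timeit.timeit(lambda: func(SAMPLE), number=number)
    return results


if __name__ == "__main__":
    # Print the candidates from fastest to slowest.
    for name, seconds in sorted(run_benchmarks().items(), key=lambda kv: kv[1]):
        print(f"{name:15s} {seconds:.3f} s")
```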
够拽才男人
#4 · 2020-03-10 05:58

I'd use str.translate. If you pass None as the translation table, only the delete step is performed. Here I pass string.ascii_lowercase as the characters to delete. (This two-argument form is the Python 2 signature of str.translate.)

>>> import string
>>> s.translate(None,string.ascii_lowercase)
'FOOFOOOBAR'
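
On Python 3, `str.translate` takes only a table, so the call above raises a `TypeError`. A rough equivalent, assuming the same ASCII-only notion of "lowercase", builds the table with `str.maketrans`, whose third argument lists characters to delete. The names `DELETE_LOWER` and `remove_lower` are illustrative.

```python
import string

# Third argument of str.maketrans: characters that are mapped to None,
# which str.translate interprets as "delete this character".
DELETE_LOWER = str.maketrans("", "", string.ascii_lowercase)


def remove_lower(text):
    """Return text with all ASCII lowercase letters removed."""
    return text.translate(DELETE_LOWER)


s = "FOObarFOOObBAR"
print(remove_lower(s))  # FOOFOOOBAR
```

Building the table once and reusing it keeps the per-call work down to the `translate` call itself.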

I doubt you'll find a faster way, but there's always timeit to compare different options if someone is motivated :).
