Regular expression split: FutureWarning: split()

Posted 2020-03-11 06:13

Question:

I get a warning in Python 3 when I call re.split() as follows:

import re

pattern = re.compile(r'\s*')
match = re.split(pattern, 'I am going to school')
print(match)

python3.6/re.py:212: FutureWarning: split() requires a non-empty pattern match.
  return _compile(pattern, flags).split(string, maxsplit)

I don't understand why I am getting this warning.

Answer 1:

You are getting this warning because the \s* pattern asks re.split to split on runs of zero or more whitespace characters.

But... the empty string matches that pattern too, because it contains zero whitespace characters!
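
As a quick check of that claim, here is a minimal sketch using re.fullmatch; the variable names are only illustrative:

```python
import re

pattern = re.compile(r'\s*')

# fullmatch succeeds only if the whole (here: empty) string matches
matches_empty = pattern.fullmatch('') is not None
print(matches_empty)  # True: zero whitespace characters still satisfy \s*

# \s+ needs at least one whitespace character, so it cannot match ''
plus_matches_empty = re.fullmatch(r'\s+', '') is not None
print(plus_matches_empty)  # False
```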

It's unclear what re.split should do with this. This is what str.split does:

>>> 'hello world'.split('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: empty separator
>>>

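In Python 3.6, re.split simply ignores the empty matches and splits only where the pattern matched one or more whitespace characters. It emits the FutureWarning you're seeing to tell you about that decision, because the behavior was due to change: from Python 3.7 on, re.split also splits on empty matches, so \s* gives a different result there.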

You can state that intent explicitly by replacing * with +. Compare:

$ python3.6 -c "import re; print(re.split('\s*', 'I am going to school'))"
/usr/lib64/python3.6/re.py:212: FutureWarning: split() requires a non-empty pattern match.
  return _compile(pattern, flags).split(string, maxsplit)
['I', 'am', 'going', 'to', 'school']

$ python3.6 -c "import re; print(re.split('\s+', 'I am going to school'))"
['I', 'am', 'going', 'to', 'school']