I would like to split a string into sections of numbers and sections of text/symbols. My current code doesn't handle negative numbers or decimals, and it behaves weirdly, adding an empty list element at the end of the output.
import re
mystring = 'AD%5(6ag 0.33--9.5'
newlist = re.split('([0-9]+)', mystring)
print(newlist)
current output:
['AD%', '5', '(', '6', 'ag ', '0', '.', '33', '--', '9', '.', '5', '']
desired output:
['AD%', '5', '(', '6', 'ag ', '0.33', '-', '-9.5']
As mentioned here before, there is no option to ignore the empty strings in `re.split()`, but you can easily construct a new list by filtering them out.
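The snippet that accompanied this answer is not preserved, so the following is only a minimal sketch of the idea, assuming a list comprehension is used to drop the empty strings. It reuses `mystring` and the pattern from the question.

```python
import re

mystring = 'AD%5(6ag 0.33--9.5'

# Split on runs of digits (the capturing group keeps the digits in the result),
# then keep only the non-empty pieces.
newlist = [part for part in re.split(r'([0-9]+)', mystring) if part]
print(newlist)
# ['AD%', '5', '(', '6', 'ag ', '0', '.', '33', '--', '9', '.', '5']
```

This only removes the trailing empty string; decimals and negative numbers are still broken apart because the pattern itself is unchanged.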
Unfortunately, `re.split()` does not offer an "ignore empty strings" option. However, to retrieve your numbers, you could easily use `re.findall()` with a different pattern.
Your issue is related to the fact that your regex captures one or more digits and adds them to the resulting list, while those digits also act as the delimiter, so the parts before and after each match are returned as well. If the string ends with digits, the split therefore adds an empty string at the end of the resulting list.
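As a rough sketch of the `re.findall()` suggestion: the pattern originally posted with that answer is not preserved, so this example assumes the `-?\d*\.?\d+` pattern discussed elsewhere in this thread. Note that it returns only the numbers, not the text between them.

```python
import re

mystring = 'AD%5(6ag 0.33--9.5'

# Optional minus, optional integer part, optional dot, at least one digit.
numbers = re.findall(r'-?\d*\.?\d+', mystring)
print(numbers)
# ['5', '6', '0.33', '-9.5']
```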
You may split with a regex that matches float or integer numbers with an optional minus sign and then remove empty values:
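A minimal sketch of this approach, assuming the `-?\d*\.?\d+` pattern explained in this answer is wrapped in a capturing group so that the numbers are kept in the result (the variable names are illustrative):

```python
import re

mystring = 'AD%5(6ag 0.33--9.5'

# The capturing group makes re.split() return the matched numbers as well.
number = r'(-?\d*\.?\d+)'

# Filter out the empty string produced when the input ends with a number.
parts = [p for p in re.split(number, mystring) if p]
print(parts)
# ['AD%', '5', '(', '6', 'ag ', '0.33', '-', '-9.5']
```

For the sample string this yields the desired output from the question.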
To match negative/positive numbers with exponents, add an optional sign and an optional exponent part to the number pattern.
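The exact pattern from the answer is not preserved here, so the one below is an assumption: it allows a leading `+` or `-` and an optional exponent such as `e-3` or `E+10`. The sample string is hypothetical and only serves to illustrate the behaviour.

```python
import re

# Hypothetical input containing signed numbers with exponents.
sample = 'x=1.5e-3;y=-2E+10;z=+7'

# Optional sign, optional integer part, optional dot, digits,
# then an optional exponent: e or E, optional sign, digits.
number = r'([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)'

parts = [p for p in re.split(number, sample) if p]
print(parts)
# ['x=', '1.5e-3', ';y=', '-2E+10', ';z=', '+7']
```

The exponent is wrapped in a non-capturing group `(?:...)` so that `re.split()` does not add it to the result as a separate item.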
The `-?\d*\.?\d+` regex matches:
- `-?` - an optional minus
- `\d*` - 0+ digits
- `\.?` - an optional literal dot
- `\d+` - one or more digits.