I would like to split a string on multiple delimiters, but keep the delimiters in the resulting list. I think this is a useful thing to do as an initial step of parsing any kind of formula, and I suspect there is a nice Python solution.
Someone asked a similar question in Java here.
For example, a typical split looks like this:
>>> s='(twoplusthree)plusfour'
>>> s.split('plus')
['(two', 'three)', 'four']
But I'm looking for a nice way to add the plus back in (or retain it):
['(two', 'plus', 'three)', 'plus', 'four']
Ultimately I'd like to do this for each operator and bracket, so if there's a way to get
['(', 'two', 'plus', 'three', ')', 'plus', 'four']
all in one go, then all the better.
Here is an easy way using re.split.
re.split is very similar to str.split, except that instead of a literal delimiter you pass a regex pattern. The trick here is to put () around the pattern so that it is captured as a group and the matched delimiters are returned along with the pieces. Bear in mind that you'll get empty strings if there are two consecutive occurrences of the delimiter pattern, or if the string starts or ends with a delimiter.
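A minimal sketch of the capturing-group trick, reusing the string from the question. The second pattern, which also treats brackets as delimiters, is an assumption added for illustration; it shows where the empty strings come from.

```python
import re

s = '(twoplusthree)plusfour'

# Wrapping the pattern in a capturing group makes re.split keep the delimiters.
parts = re.split(r'(plus)', s)
print(parts)
# ['(two', 'plus', 'three)', 'plus', 'four']

# Illustrative pattern: 'plus' or a single bracket. A delimiter at the start
# of the string, or two delimiters in a row, leaves an empty string behind.
with_empties = re.split(r'(plus|[()])', s)
print(with_empties)
# ['', '(', 'two', 'plus', 'three', ')', '', 'plus', 'four']
```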
Here I'm splitting a string on the first occurrence of alphabetic characters:
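A minimal sketch of that idea, assuming a made-up sample string and using re.split with maxsplit=1 so that only the first run of letters acts as a delimiter. The capturing group keeps that run in the result.

```python
import re

# Hypothetical sample input, chosen only for illustration.
text = '123abc456def'

# maxsplit=1 stops after the first match; the group keeps the matched letters.
parts = re.split(r'([A-Za-z]+)', text, maxsplit=1)
print(parts)
# ['123', 'abc', '456def']
```

If the string contains no letters at all, re.split returns a one-element list holding the whole string.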
You can do that with Python's re module. You can leave out the list if you only need an iterator.
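One way this might look, as a sketch: split on a capturing group that covers 'plus' and both brackets, then drop the empty strings with filter(None, ...). The exact pattern is an assumption based on the tokens asked for in the question. In Python 3, filter returns a lazy iterator, which is why wrapping it in list() is optional.

```python
import re

s = '(twoplusthree)plusfour'

# Assumed pattern: the word 'plus' or a single bracket, captured so it is kept.
pattern = r'(plus|[()])'

# filter(None, ...) discards the empty strings left by adjacent delimiters.
tokens = list(filter(None, re.split(pattern, s)))
print(tokens)
# ['(', 'two', 'plus', 'three', ')', 'plus', 'four']
```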
This thread is old, but since it's the top Google result I thought I'd add this:
If you don't want to use regex, there is a simpler way to do it. Basically, just call split, but put the separator back on every token except the last.
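A minimal sketch of that approach; the function name split_keep is hypothetical. Note that this keeps each separator attached to the end of the preceding piece rather than returning it as a separate list element, and it handles only a single literal separator.

```python
def split_keep(text, sep):
    """Split text on sep, re-attaching sep to every piece except the last."""
    pieces = text.split(sep)
    return [piece + sep for piece in pieces[:-1]] + [pieces[-1]]


result = split_keep('(twoplusthree)plusfour', 'plus')
print(result)
# ['(twoplus', 'three)plus', 'four']
```

Joining the pieces back together with ''.join(result) reproduces the original string.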