Is there any equivalent to str.split in Python that also returns the delimiters?
I need to preserve the whitespace layout for my output after processing some of the tokens.
Example:
>>> s="\tthis is an example"
>>> print s.split()
['this', 'is', 'an', 'example']
>>> print what_I_want(s)
['\t', 'this', ' ', 'is', ' ', 'an', ' ', 'example']
Thanks!
Thanks, guys, for pointing me to the re module. I'm still trying to decide between that and using my own function that returns a sequence. If I had time I'd benchmark them xD
How about
Have you looked at pyparsing? Example borrowed from the pyparsing wiki:
The re module provides this functionality (quoted from the Python documentation).
For your example (split on whitespace), use re.split(r'(\s+)', '\tThis is an example'). The key is to enclose the regex on which to split in capturing parentheses. That way, the delimiters are added to the list of results.
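As a minimal sketch of the difference the capturing parentheses make, here is the asker's sample string split both ways. The variable names are only for illustration.

```python
import re

s = "\tthis is an example"

# With capturing parentheses, re.split keeps each whitespace run as its own item.
tokens = re.split(r"(\s+)", s)
print(tokens)
# ['', '\t', 'this', ' ', 'is', ' ', 'an', ' ', 'example']

# Without the parentheses, the whitespace is thrown away.
plain = re.split(r"\s+", s)
print(plain)
# ['', 'this', 'is', 'an', 'example']

# Because nothing is dropped, joining the tokens rebuilds the original string.
print("".join(tokens) == s)
```

Note the empty string at the front: the input starts with a delimiter, so re.split reports an empty field before it.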
Edit: As pointed out, a leading or trailing delimiter will also be added to the list, together with an empty string before or after it. To avoid that, you can call the .strip() method on your input string first.
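Since the question asks to preserve the whitespace layout, stripping may not be what you want, because it discards the leading tab. A possible alternative, sketched below, is to keep the input as it is and filter out only the empty strings. The helper name split_keep_whitespace is made up for this example and is not part of the re module.

```python
import re

def split_keep_whitespace(text):
    """Hypothetical helper: split on whitespace runs and keep them as tokens."""
    # re.split emits '' next to a leading or trailing delimiter; drop only those.
    return [tok for tok in re.split(r"(\s+)", text) if tok]

s = "\tthis is an example\n"

# Stripping first removes the edge whitespace entirely.
stripped = re.split(r"(\s+)", s.strip())
print(stripped)
# ['this', ' ', 'is', ' ', 'an', ' ', 'example']

# Filtering keeps the leading tab and the trailing newline.
kept = split_keep_whitespace(s)
print(kept)
# ['\t', 'this', ' ', 'is', ' ', 'an', ' ', 'example', '\n']

# An equivalent one-liner: match runs of non-whitespace or whitespace.
print(re.findall(r"\S+|\s+", s) == kept)
```

Either variant lets you process the word tokens and then rebuild the line with "".join(...) without losing the original spacing.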