I have a multiline string which is delimited by a set of different delimiters:
(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)
I can split this string into its parts, using String.split
, but it seems that I can't get the actual string, which matched the delimiter regex.
In other words, this is what I get:
Text1
Text2
Text3
Text4
This is what I want
Text1
DelimiterA
Text2
DelimiterC
Text3
DelimiterB
Text4
Is there any JDK way to split the string using a delimiter regex but also keep the delimiters?
I suggest using Pattern and Matcher, which will almost certainly achieve what you want. Your regular expression will need to be somewhat more complicated than what you are using in String.split.
Fast answer: use non physical bounds like \b to split. I will try and experiment to see if it works (used that in PHP and JS).
It is possible, and kind of work, but might split too much. Actually, it depends on the string you want to split and the result you need. Give more details, we will help you better.
Another way is to do your own split, capturing the delimiter (supposing it is variable) and adding it afterward to the result.
My quick test:
Result:
A bit too much... :-)
Another candidate solution using a regex. Retains token order, correctly matches multiple tokens of the same type in a row. The downside is that the regex is kind of nasty.
Sample output:
I know this is a very-very old question and answer has also been accepted. But still I would like to submit a very simple answer to original question. Consider this code:
OUTPUT:
I am just using word boundary
\b
to delimit the words except when it is start of text.