I have a multiline string which is delimited by a set of different delimiters:
(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)
I can split this string into its parts using String.split, but it seems that I can't get the actual string that matched the delimiter regex.
In other words, this is what I get:
Text1
Text2
Text3
Text4
This is what I want:
Text1
DelimiterA
Text2
DelimiterC
Text3
DelimiterB
Text4
Is there any JDK way to split the string using a delimiter regex but also keep the delimiters?
I tweaked Pattern.split() to include the matched pattern in the returned list.
Added
Full source
An extremely naive and inefficient solution, which works nevertheless. Use split twice on the string and then concatenate the two arrays.
I got here late, but returning to the original question, why not just use lookarounds?
output:
EDIT: What you see above is what appears on the command line when I run that code, but I now see that it's a bit confusing. It's difficult to keep track of which commas are part of the result and which were added by Arrays.toString(). SO's syntax highlighting isn't helping either. In hopes of getting the highlighting to work with me instead of against me, here's how those arrays would look if I were declaring them in source code:
I hope that's easier to read. Thanks for the heads-up, @finnw.
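The lookaround idea can be sketched roughly as follows. This is an illustration under assumptions, not the exact code from the answer: the class name `LookaroundSplit` and the helper `splitKeeping` are made up for the example, and the delimiter regex has to have a bounded length, because Java rejects unbounded patterns (such as `;+`) inside a lookbehind.

```java
import java.util.Arrays;

class LookaroundSplit {
    // Hypothetical helper: splits at the zero-width positions just before and
    // just after each delimiter, so the delimiters survive as separate elements.
    // delimRegex must be valid inside a lookbehind (bounded length).
    static String[] splitKeeping(String input, String delimRegex) {
        String regex = "((?<=" + delimRegex + ")|(?=" + delimRegex + "))";
        return input.split(regex);
    }

    public static void main(String[] args) {
        String input = "a;b;c;d";

        // Lookbehind only: each delimiter stays attached to the text before it.
        System.out.println(Arrays.toString(input.split("(?<=;)")));
        // prints [a;, b;, c;, d]

        // Lookahead only: each delimiter stays attached to the text after it.
        System.out.println(Arrays.toString(input.split("(?=;)")));
        // prints [a, ;b, ;c, ;d]

        // Both: delimiters become elements of their own.
        System.out.println(Arrays.toString(splitKeeping(input, ";")));
        // prints [a, ;, b, ;, c, ;, d]
    }
}
```

Because the split points are zero-width, nothing is consumed, which is why the delimiters end up in the result at all.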
I had a look at the above answers and honestly I don't find any of them satisfactory. What you want to do is essentially mimic the Perl split functionality. Why Java doesn't allow this and have a join() method somewhere is beyond me, but I digress. You don't even need a class for this really. It's just a function. Run this sample program:
Some of the earlier answers have excessive null-checking, a topic I recently wrote a response about here:
https://stackoverflow.com/users/18393/cletus
Anyway, the code:
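A minimal sketch of such a function, which is not necessarily the code this answer originally contained. It assumes the Perl convention that text matched by capturing groups in the delimiter pattern is returned alongside the pieces; the names `PerlStyleSplit` and `perlSplit` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PerlStyleSplit {
    // Hypothetical helper: splits input on the pattern and, as Perl's split does,
    // also returns whatever the pattern's capturing groups matched.
    static List<String> perlSplit(Pattern delimiter, String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = delimiter.matcher(input);
        int last = 0;
        while (m.find()) {
            if (m.start() == m.end()) {
                continue; // ignore zero-length matches in this sketch
            }
            parts.add(input.substring(last, m.start()));
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) { // a group may not have participated
                    parts.add(m.group(g));
                }
            }
            last = m.end();
        }
        parts.add(input.substring(last)); // trailing text after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        // With a capturing group the delimiters are kept.
        System.out.println(perlSplit(Pattern.compile("([+-])"), "12+13-14"));
        // prints [12, +, 13, -, 14]

        // Without one, this behaves like a plain split.
        System.out.println(perlSplit(Pattern.compile("[+-]"), "12+13-14"));
        // prints [12, 13, 14]
    }
}
```

Unlike String.split, this sketch keeps a trailing empty string when the input ends with a delimiter; whether that is desirable depends on the caller.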
I don't think it is possible with String#split, but you can use a StringTokenizer, though that won't allow you to define your delimiter as a regex, but only as a set of single characters:
Pass the 3rd argument as "true". It will return delimiters as well.
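For illustration, a small sketch of the StringTokenizer approach with the returnDelims flag set to true. The class name `TokenizerKeepingDelimiters`, the helper `tokenize`, and the sample input are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerKeepingDelimiters {
    // Hypothetical helper: every character in delimiterChars is treated as a
    // delimiter; with returnDelims = true each one comes back as its own token.
    static List<String> tokenize(String input, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delimiterChars, true);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Text1;Text2,Text3", ";,"));
        // prints [Text1, ;, Text2, ,, Text3]
    }
}
```

Note that a multi-character delimiter such as "--" would come back as two separate one-character tokens, which is the limitation mentioned above.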