I'm trying to split a string. Runs of multiple characters such as !!!, ?? or ... denote the end of a sentence, so I want anything after them to be on a new line. For example, the sentence
hey.. hello split !!! example me.
should be turned into:
hey..
hello split !!!
example me.
What I tried:
String myStr= "hey.. hello split !!! example me.";
String [] split = myStr.split("(?<=\\.{2,})");
This works fine when I have multiple dots but doesn't work for anything else. I can't add exclamation marks to this expression either: "(?<=[\\.{2,}!{2,}])" splits after each dot and each exclamation mark. Is there any way to combine those?
Ideally I wanted the app to split after a SINGLE dot too (anything that denotes the end of a sentence), but I don't think this is possible in a single pass... Thanks
Just do it like this:
It captures the first character from the list [?.!] and expects the same character to be present one or more times. If so, the split should occur right after this run.
Ideone
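As a rough illustration of the idea described above (capture one character from [?.!], then require the same character again via a backreference), here is a minimal sketch. It uses a Matcher loop rather than split, because Java does not accept a backreference inside a lookbehind. The class name, method name and the trailing \s* that drops the following whitespace are assumptions made for this example, not the answerer's original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedPunctuationSplit {
    // Group 1 captures one of ? . !, \1+ requires the same character at least
    // once more, and \s* swallows the whitespace that follows the run.
    private static final Pattern RUN = Pattern.compile("([?.!])\\1+\\s*");

    // Hypothetical helper: cuts the text right after each run of repeated punctuation.
    static List<String> splitOnRepeats(String text) {
        List<String> parts = new ArrayList<>();
        Matcher m = RUN.matcher(text);
        int start = 0;
        while (m.find()) {
            // Keep the punctuation with the sentence, trim the consumed whitespace.
            parts.add(text.substring(start, m.end()).trim());
            start = m.end();
        }
        if (start < text.length()) {
            parts.add(text.substring(start).trim());
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String part : splitOnRepeats("hey.. hello split !!! example me.")) {
            System.out.println(part);
        }
        // prints:
        // hey..
        // hello split !!!
        // example me.
    }
}
```

Because the backreference demands a repeat of the same character, a single dot (as in "example me.") does not trigger a cut in this sketch.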
To do this, you first have to decide which cases you consider the end of a sentence. Multiple special symbols are not a standard way of ending a sentence (as far as I know).
But if you want to account for nefarious users, or for casual mistakes that make special symbols look like the end of a sentence, then at least make a list of such cases before proceeding.
For your situation, where you want to split the string on multiple special symbols, a lookbehind won't be of much help because, as Wiktor noted
So we need to find the zero-width position where the split needs to happen, and the following regex does exactly that.
Regex:
(?<=[.!?])(?=[^.!?])
Regex101 Demo Ideone Demo
(?<=[.!?]) (?=[^.!?])
Regex101 Demo Ideone Demo
Note the space between the two assertions in the second regex. Use it if you want to consume the preceding space when starting the next line.
Explanation:
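In both patterns the lookbehind (?<=[.!?]) asserts that the previous character is one of . ! ? and the lookahead (?=[^.!?]) asserts that the next character is not, so the match sits at the end of a punctuation run. A small sketch of how the two behave with String.split follows; the class and constant names are made up for this example.

```java
import java.util.Arrays;

public class LookaroundSplit {
    // Zero-width boundary: previous char is . ! or ?, next char is none of them.
    static final String BOUNDARY = "(?<=[.!?])(?=[^.!?])";
    // Same boundary, but the single space between the assertions is consumed.
    static final String BOUNDARY_WITH_SPACE = "(?<=[.!?]) (?=[^.!?])";

    public static void main(String[] args) {
        String myStr = "hey.. hello split !!! example me.";

        // Nothing is consumed, so later pieces keep their leading space.
        System.out.println(Arrays.toString(myStr.split(BOUNDARY)));
        // [hey..,  hello split !!!,  example me.]

        // The space after the punctuation run is eaten by the split.
        System.out.println(Arrays.toString(myStr.split(BOUNDARY_WITH_SPACE)));
        // [hey.., hello split !!!, example me.]
    }
}
```

Both patterns also split after a single dot, which covers the "SINGLE dot" wish in the question. The flip side is that the first pattern splits at any punctuation-to-text boundary, so a decimal such as 3.14 would be cut into "3." and "14".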
A lookbehind, with a negative lookahead to prevent a split within the group:
Some test code:
Output:
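One pattern that fits the description above (a lookbehind for sentence punctuation plus a negative lookahead so the split never lands inside a run) might look like the sketch below. The exact pattern, the trailing \s* and the class and method names are assumptions for illustration, not necessarily the code this answer originally showed.

```java
import java.util.Arrays;

public class NegativeLookaheadSplit {
    // (?<=[.!?])  previous character ends a sentence
    // (?![.!?])   next character is NOT more punctuation, so "!!!" stays together
    // \s*         swallow whitespace before the next sentence
    static final String SENTENCE_END = "(?<=[.!?])(?![.!?])\\s*";

    // Hypothetical helper wrapping String.split with the pattern above.
    static String[] sentences(String text) {
        return text.split(SENTENCE_END);
    }

    public static void main(String[] args) {
        String[] parts = sentences("hey.. hello split !!! example me.");
        System.out.println(Arrays.toString(parts));
        // [hey.., hello split !!!, example me.]
    }
}
```

The pattern also matches at the very end of the input, but String.split drops trailing empty strings, so no extra element appears.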