How can I split a text into an array of sentences?
Example text:
Fry me a Beaver. Fry me a Beaver! Fry me a Beaver? Fry me Beaver no. 4?! Fry me many Beavers... End
Should output:
0 => Fry me a Beaver.
1 => Fry me a Beaver!
2 => Fry me a Beaver?
3 => Fry me Beaver no. 4?!
4 => Fry me many Beavers...
5 => End
I tried some solutions that I've found on SO through search, but they all fail, especially at the 4th sentence.
/(?<=[!?.])./
/\.|\?|!/
/((?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]\"))(\s|\r\n)(?=\"?[A-Z])/
/(?<=[.!?]|[.!?][\'"])\s+/ // <- closest one
Since you want to "split" the text into sentences, why are you trying to match them?
For this case let's use preg_split().
Code:
Output:
Explanation:
Put simply, we split on runs of whitespace (\s+), subject to two conditions:
(?<=[.?!]) Positive lookbehind assertion: the whitespace must be preceded by a period, question mark, or exclamation mark.
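(?=[a-z]) Positive lookahead assertion: the whitespace must be followed by a letter. This is a workaround for the "no. 4" problem: the space after "no." is followed by a digit, so no split happens there. Because the sample sentences start with capital letters, [a-z] only matches them if the pattern is case-insensitive (the i modifier).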
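As a minimal sketch of the pattern described above, the two assertions can be combined around \s+ and passed to preg_split(). The variable names and the i modifier are assumptions made for illustration; the explanation only gives the two assertions and the \s+ separator.

```php
<?php
// Sample text from the question.
$text = 'Fry me a Beaver. Fry me a Beaver! Fry me a Beaver? Fry me Beaver no. 4?! Fry me many Beavers... End';

// Split on whitespace that follows ".", "?" or "!" and precedes a letter.
// Assumption: the trailing "i" modifier, so that [a-z] also matches capitals.
$pattern = '/(?<=[.?!])\s+(?=[a-z])/i';

// preg_split() returns the pieces between the matched separators.
$sentences = preg_split($pattern, $text);

print_r($sentences);
```

With the sample text this should give the six sentences listed in the question, with "Fry me Beaver no. 4?!" kept in one piece. Note that an abbreviation followed by a letter (for example "Mr. Smith") would still be split, so this is a workaround for this input rather than a general sentence tokenizer.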