I want to split a paragraph string into an array of sentences. Naturally, I am using a regular expression with the dot character (.) to split the paragraph into sentences. The problem is the academic title abbreviations in the sentences: every abbreviation also ends with a dot (.), so my regex splits the paragraph in the wrong places.
Here is an example paragraph:
Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.
Using only the dot (.) as the regex, I get:
Array (
[0] => Meanwhile Rector of Bogor Agricultural University, Prof
[1] => Dr
[2] => Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time
[3] => ...
)
And this is what I actually want:
Array (
[0] => Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time
[1] => Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin
)
This seems to work, but is a new PHP function vs. strictly RegEx -
You could use negative lookbehinds, so that a dot only counts as a sentence boundary when it is not preceded by a known title:
((?<!Prof)(?<!Dr)(?<!Mr)(?<!Mrs)(?<!Ms))\.
Add more titles if needed. Explained demo here: http://regex101.com/r/xQ3xF9
And the code could look like this:
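A minimal sketch, assuming PHP's `preg_split` with the lookbehind pattern above. The variable names, the `PREG_SPLIT_NO_EMPTY` flag, and the `trim` step are illustrative choices, not something the pattern itself requires:

```php
<?php
// Sample paragraph taken from the question.
$paragraph = 'Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.';

// Split on a dot only when it is NOT directly preceded by one of the
// listed titles. Each lookbehind is fixed-length, as PCRE requires.
$pattern = '/(?<!Prof)(?<!Dr)(?<!Mr)(?<!Mrs)(?<!Ms)\./';

// PREG_SPLIT_NO_EMPTY drops the empty piece after the final dot.
$sentences = preg_split($pattern, $paragraph, -1, PREG_SPLIT_NO_EMPTY);

// Remove the leading space left at the start of the second sentence.
$sentences = array_map('trim', $sentences);

print_r($sentences);
```

Note that the lookbehinds are case-sensitive and only check the characters immediately before the dot, so a sentence that happens to end in one of the listed titles would not be split there.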