For example, I want to split this sentence:
I am a sentence.
into an array with 5 parts: "I", "am", "a", "sentence", and ".".
I'm currently using preg_split after trying explode, but I can't seem to find something suitable.
This is what I've tried:
$sentence = explode(" ", $sentence);
/*
returns array(4) {
[0]=>
string(1) "I"
[1]=>
string(2) "am"
[2]=>
string(1) "a"
[3]=>
string(9) "sentence."
}
*/
And also this:
$sentence = preg_split("/[.?!\s]/", $sentence);
/*
returns array(5) {
[0]=>
string(1) "I"
[1]=>
string(2) "am"
[2]=>
string(1) "a"
[3]=>
string(8) "sentence"
[4]=>
string(0) ""
}
*/
How can this be done?
You can split on word boundaries:
In short, the regex looks for a position that comes right after a word character. At that position it must match a word boundary, followed by optional whitespace, and the string is split there.
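A minimal sketch of a split matching that description. The pattern here is an assumption built from the explanation above (a lookbehind for a word character, then `\b`, then `\s*`), so treat it as one possible form rather than the definitive one:

```php
<?php
// Assumed pattern: split at a position preceded by a word character
// where a word boundary occurs, consuming any whitespace that follows.
$parts = preg_split('/(?<=\w)\b\s*/', 'I am a sentence.');

print_r($parts);
// The five pieces are: "I", "am", "a", "sentence", "."
```

Because the match before the period is empty, nothing is consumed there and the period survives as its own element.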
Output:
I was looking for the same solution and landed here. The accepted solution does not work with non-word characters like apostrophes and accent marks and so forth. Below, find the solution that worked for me.
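To illustrate the apostrophe problem, here is a small sketch. Both the word-boundary pattern and the sample sentence are assumptions chosen for the demonstration, not taken from the thread:

```php
<?php
// Assumed word-boundary pattern. \w does not match an apostrophe,
// so a word boundary is found in the middle of the contraction.
$parts = preg_split('/(?<=\w)\b\s*/', "Don't stop.");

print_r($parts);
// "Don't" comes back as two pieces: "Don" and "'t"
```

The same thing happens with accented letters when the pattern is not Unicode-aware, since those bytes are not treated as word characters.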
Here is my test sentence:
The accepted answer gave me the following results:
The solution I came up with follows:
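One possible Unicode-aware approach, given as a hedged sketch rather than the exact code of this answer: match the tokens instead of splitting on separators. The helper name `splitWordsAndPunctuation` and the sample sentence are hypothetical, and the source file is assumed to be saved as UTF-8:

```php
<?php
// Hypothetical helper: returns words and punctuation marks as separate items.
function splitWordsAndPunctuation(string $text): array
{
    // A token is either a run of letters, combining marks, and digits
    // (optionally joined by apostrophes), or a single non-space symbol.
    // The /u modifier makes the pattern treat the input as UTF-8.
    $pattern = "/[\p{L}\p{M}\p{N}]+(?:['’][\p{L}\p{M}\p{N}]+)*|[^\s\p{L}\p{M}\p{N}]/u";
    preg_match_all($pattern, $text, $matches);

    return $matches[0];
}

print_r(splitWordsAndPunctuation("Zoë's café isn't open."));
// The pieces are: "Zoë's", "café", "isn't", "open", "."
```

Matching with `preg_match_all` avoids the empty trailing element that a plain `preg_split` on punctuation and whitespace produces, and the Unicode property classes keep accented words and contractions in one piece.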
It gives the following results: