Question:
I have been using
explode(".",$mystring)
to split a paragraph into sentences. However, this doesn't cover sentences that end with different punctuation such as ! ? : ;
Is there a way of using an array as a delimiter instead of a single character? Alternatively, is there another neat way of splitting on various punctuation marks?
I tried
explode(("." || "?" || "!"),$mystring)
in the hope that it would work, but it didn't...
Answer 1:
You can do:
preg_split('/\.|\?|!/',$mystring);
or (simpler):
preg_split('/[.?!]/',$mystring);
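As a rough sketch of what this returns (the sample string and variable names below are made up for illustration): the delimiters are dropped and the pieces keep their leading spaces, so PREG_SPLIT_NO_EMPTY and a trim are usually wanted as well.

```php
<?php
// Hypothetical sample input, not taken from the question.
$mystring = 'Hello there. How are you? Fine!';

// Split on any of . ? ! and drop the empty piece after the final "!".
$parts = preg_split('/[.?!]/', $mystring, -1, PREG_SPLIT_NO_EMPTY);

// Every piece after the first still starts with a space, so trim them all.
$sentences = array_map('trim', $parts);

print_r($sentences);
```

Note that the punctuation marks themselves are gone from the result.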
Answer 2:
You can use preg_split() combined with a PCRE lookbehind assertion to split the string after each occurrence of ., ;, :, ? or ! while keeping the actual punctuation intact:
Code:
$subject = 'abc sdfs. def ghi; this is an.email@addre.ss! asdasdasd? abc xyz';
// split on whitespace between sentences preceded by a punctuation mark
$result = preg_split('/(?<=[.?!;:])\s+/', $subject, -1, PREG_SPLIT_NO_EMPTY);
print_r($result);
Result:
Array
(
[0] => abc sdfs.
[1] => def ghi;
[2] => this is an.email@addre.ss!
[3] => asdasdasd?
[4] => abc xyz
)
Answer 3:
Assuming that you actually want the punctuation marks in the end result, have you tried:
$mystring = str_replace("?","?---",str_replace(".",".---",str_replace("!","!---",$mystring)));
$tmp = explode("---",$mystring);
This would leave your punctuation marks intact.
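The same idea can be written with a single str_replace call, since it accepts arrays for the search and replacement arguments. This is only a sketch: the sample string is hypothetical, and it assumes the marker "---" never occurs in the text itself.

```php
<?php
// Hypothetical sample input; assumes "---" never occurs in the text itself.
$mystring = 'Stop! Who goes there? A friend.';

// One str_replace call with arrays instead of three nested calls.
$marked = str_replace(['.', '?', '!'], ['.---', '?---', '!---'], $mystring);

// Split on the marker, trim each piece, and drop the empty trailing piece.
$tmp = array_values(array_filter(array_map('trim', explode('---', $marked)), 'strlen'));

print_r($tmp);
```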
Answer 4:
preg_split('/\s+|[.?!]/',$string);
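Note that this also splits on every run of whitespace, so it returns individual words rather than sentences. Another possible problem is an email address: the dot inside it is treated as a delimiter, so the address gets split partway through.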
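One way to avoid splitting inside tokens such as email addresses is to treat a punctuation mark as a delimiter only when whitespace or the end of the string follows it. The sketch below assumes that rule is good enough for the input; the sample address is hypothetical.

```php
<?php
// Hypothetical sample input containing an email address.
$string = 'Mail bob@example.com now. Thanks!';

// Match . ? or ! only when followed by whitespace or the end of the string,
// then swallow the whitespace so the next sentence starts cleanly.
$sentences = preg_split('/[.?!](?=\s|$)\s*/', $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($sentences);
```

The dot in the address is followed by a letter, so it is left alone. Abbreviations such as "e.g. " would still be split, so this is not a complete sentence detector.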
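Answer 5:
Use preg_split and give it a regex like /[.?!]/ to split on. (Inside a character class, | is a literal pipe rather than alternation, and . and ? need no escaping.)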
Answer 6:
You can try preg_split
$sentences = preg_split("/[\.\?\!,;]+/", $mystring);
Please note that this removes the punctuation. If you would also like to strip the whitespace that follows each delimiter:
$sentences = preg_split("/[\.\?\!,;]+\s*/", $mystring);
Answer 7:
$mylist = preg_split("/[.?!:;]/", $mystring);
Answer 8:
You can't have multiple delimiters for explode. That's what preg_split() is for. But even then, it splits at the delimiter, so the sentences are returned without their punctuation marks.
You can take preg_split a step further and pass the PREG_SPLIT_DELIM_CAPTURE flag (with a capturing group around the delimiter in the pattern) so that the delimiters are returned as their own elements, then loop over the returned array to join each sentence with the punctuation mark that follows it. Or just use preg_match_all():
preg_match_all('~.*?[?.!]~s', $string, $sentences);
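With preg_match_all, the matched sentences end up in $sentences[0], and any trailing text that lacks closing punctuation is not matched at all. The PREG_SPLIT_DELIM_CAPTURE route could look roughly like this; the sample string is hypothetical, and grouping runs of punctuation with + is an added assumption so that "..." stays attached to its sentence.

```php
<?php
// Hypothetical sample input, not taken from the question.
$string = 'Is it done? Yes. Great!';

// The capturing group makes preg_split return each run of punctuation
// as its own element, alternating text, delimiter, text, delimiter, ...
$parts = preg_split('/([.?!]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

// Glue each text chunk back onto the punctuation that follows it.
$joined = [];
for ($i = 0, $n = count($parts); $i < $n; $i += 2) {
    $sentence = trim($parts[$i] . ($parts[$i + 1] ?? ''));
    if ($sentence !== '') {
        $joined[] = $sentence; // skip the empty piece after the last delimiter
    }
}

print_r($joined);
```

Unlike the preg_match_all pattern, this keeps a final fragment even when it has no closing punctuation.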