I'm trying to shorten long descriptions with an ellipsis (…) and want the cut to fall on a word boundary.
Here's my current code (eval.in):
# Assume $body is a long text.
$line = $body;
if(strlen($body) > 300 && preg_match('/^.{1,300}\b/su', $body, $match)) {
$line = trim($match[0]) . "…";
}
echo $line;
This actually works pretty well and I like it, except that sometimes the word boundary is followed by punctuation.
With the code above, I get results like `This is a long description…` or `I have punctuations,…`. I would love to remove the punctuation after the last word before adding the ellipsis.
Help?
You can use `\w` before `\b`, which ensures we don't add the ellipsis after a non-word character.
Here is your fixed approach:
See eval.in demo
As I noted in the comments, you can match the punctuation optionally with `\p{P}*`, but I forgot that `\b` can match both a trailing and a leading word boundary. By restricting the `\b` with the negative lookahead `(?!\w)` (as in `(?!\w)\b`), we only match the trailing word boundary.
Besides, a capturing group (`(...)`) is added to the pattern so that Group 1 captures only the string with the trailing punctuation trimmed off, and the value can be accessed with `$match[1]`.
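The pieces described above can be put together roughly as follows. This is a minimal sketch under stated assumptions, not necessarily the exact code from the demo: the helper name `truncate_at_word` and its `$limit` parameter are hypothetical, and `strlen()` is kept from the question even though it counts bytes rather than characters.

```php
<?php
// Hypothetical helper (illustration only): shorten $body to at most $limit
// characters, cut at the end of a word, and drop any trailing punctuation.
function truncate_at_word(string $body, int $limit = 300): string
{
    // Group 1 : up to $limit characters, ending right after a word character.
    // (?!\w)\b: a word boundary NOT followed by a word character,
    //           i.e. only the trailing boundary of a word.
    // \p{P}*  : optional punctuation, matched outside Group 1 so it is dropped.
    $pattern = '/^(.{1,' . $limit . '})(?!\w)\b\p{P}*/su';

    // strlen() counts bytes, as in the question; mb_strlen() would count characters.
    if (strlen($body) > $limit && preg_match($pattern, $body, $match)) {
        return trim($match[1]) . "…";
    }
    return $body; // short enough already, or no word boundary found
}

echo truncate_at_word("I have punctuations, and then some more words follow here.", 20), "\n";
// prints: I have punctuations…
```

With a limit of 20, the greedy `.{1,20}` first stops after the comma, where `\b` fails because the comma and the following space are both non-word characters. The engine backtracks one character to the end of `punctuations`, where the trailing boundary matches, and the comma is consumed by `\p{P}*` outside Group 1.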