Ellipsis After Certain Number or Characters with W

Posted 2019-07-17 07:19

I'm trying to shorten long descriptions with an ellipsis (…), and I want the cut to fall on a word boundary.

Here's my current code (eval.in):

# Assume $body is a long text.
$line = $body;
if(strlen($body) > 300 && preg_match('/^.{1,300}\b/su', $body, $match)) {
    $line = trim($match[0]) . "…";
}
echo $line;

This works pretty well and I like it, except that sometimes the word boundary is followed by punctuation.

With the code above, I get results like the following:

"This is a long description…" or "I have punctuations,…". I would like to remove the punctuation after the last word before adding the ellipsis.

Help?

2 answers
冷血范
#2 · 2019-07-17 07:45

You can use:

$body = preg_replace('/^(.{0,299}\w)\b.*/su', '$1…', $body);

The \w before \b ensures we don't add the ellipsis after a non-word character.
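One thing to watch with the one-liner: as written, it appears to append the ellipsis even when the text is already shorter than the limit, because the pattern still matches. A minimal sketch of a guarded version follows; the function name truncate_at_word and the $limit parameter are hypothetical, introduced only for illustration. The length check uses a regex so that it counts characters the same way the /u pattern does; mb_strlen would also work where the mbstring extension is available.

```php
<?php
// Hypothetical helper: name and $limit parameter are illustrative, not from the answer.
function truncate_at_word(string $body, int $limit = 300): string
{
    // Text that already fits in $limit characters is returned untouched.
    if (preg_match('/^.{0,' . $limit . '}\z/su', $body) === 1) {
        return $body;
    }
    // At most $limit characters, the last of which must be a word character
    // followed by a word boundary; everything after that point is dropped.
    $pattern = '/^(.{0,' . ($limit - 1) . '}\w)\b.*/su';
    $result  = preg_replace($pattern, '$1…', $body);

    // preg_replace returns null on error (for example, invalid UTF-8 input).
    return $result ?? $body;
}

echo truncate_at_word('I have punctuations, and then some more text.', 21), "\n";
// I have punctuations…
```

If the first $limit characters contain no word character at all, the pattern does not match and the text comes back unchanged, so a caller that needs a hard cap would have to handle that case separately.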

Lonely孤独者°
#3 · 2019-07-17 07:51

Here is your fixed approach:

$body = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam eu congue ex. Nunc sem arcu, fermentum vel feugiat quis, consequat nec enim. Quisque et pulvinar velit, et laoreet justo. Integer quis sapien ac turpis mattis lobortis at at metus. Vestibulum euismod turpis odio, id luctus quam pharetra, at, et. Sed finibus, nunc at ultricies posuere, dui mauris aliquet quam, eget aliquet ligula libero a turpis. Pellentesque eu diam sodales, sollicitudin leo et, sagittis magna. Donec feugiat, velit quis condimentum porttitor, enim sapien varius elit, sit amet pretium risus turpis vitae massa. Sed ac ligula sit amet lorem scelerisque tristique a id ex. Nullam maximus tincidunt magna, vel molestie lectus tempus non. Sed euismod placerat ultricies. Morbi dapibus augue ut odio faucibus, vel maximus nisl pharetra. Aliquam hendrerit dolor in ipsum pharetra, eget tincidunt lacus ultrices.";

$line = $body;
if(strlen($body) > 300 && preg_match('/^(.{1,300})(?!\w)\b\p{P}*/su', $body, $match)) {
    $line = trim($match[1]) . "…";
}
echo $line;

See eval.in demo

As I noted in the comments, you can match the punctuation (optionally, with \p{P}*), but I forgot that \b can match both a leading and a trailing word boundary. Restricting the \b with the negative lookahead (?!\w), as in (?!\w)\b, makes it match only a trailing word boundary, that is, a position where a word character is followed by a non-word character or the end of the string.

Besides, the capturing group ((...)) is added to the pattern so that we only capture into Group 1 the string with trailing punctuation trimmed out, and the value can be accessed with $match[1].
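To see the difference between the two boundaries, here is a small comparison sketch. It assumes a short sample string and a limit of 21 characters instead of 300, both chosen only so that the cut lands right after a comma; the variable names are illustrative.

```php
<?php
$body  = 'I have punctuations, and then some more text.';
$limit = 21; // small illustrative limit; the answer uses 300

// Plain \b: may stop at a leading boundary (a space followed by a letter),
// so the comma before that space is kept.
preg_match('/^.{1,' . $limit . '}\b/su', $body, $m);
$plain = trim($m[0]) . '…';

// (?!\w)\b: only a trailing boundary (a word character followed by a
// non-word character), so Group 1 always ends on a word character.
preg_match('/^(.{1,' . $limit . '})(?!\w)\b\p{P}*/su', $body, $m);
$trailing = trim($m[1]) . '…';

echo $plain, "\n";    // I have punctuations,…
echo $trailing, "\n"; // I have punctuations…
```

Note that strlen counts bytes while .{1,300} under the u flag counts characters, so for multibyte text the length check in the original condition may trigger earlier than the pattern's own limit.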
