Truncate a multibyte String to n chars

Posted 2019-01-01 14:09

Question:

I am trying to get this method in a String Filter working:

public function truncate($string, $chars = 50, $terminator = ' …');

I'd expect this

$in  = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWYXZ1234567890";
$out = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUV …";

and also this

$in  = "âãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝ";
$out = "âãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿĀāĂăĄąĆćĈĉĊċČčĎďĐđ …";

That is, the text is cut at $chars minus the length of the $terminator string, so the result including the terminator never exceeds $chars characters.

In addition, the filter is supposed to cut at the first word boundary below the $chars limit, e.g.

$in  = "Answer to the Ultimate Question of Life, the Universe, and Everything.";
$out = "Answer to the Ultimate Question of Life, the …";

I am pretty certain this should work with these steps

  • subtract the number of chars in the terminator from the maximum chars
  • validate that string is longer than the calculated limit or return it unaltered
  • find the last space character in string below calculated limit to get word boundary
  • cut string at last space or calculated limit if no last space is found
  • append terminator to string
  • return string

However, I have tried various combinations of str* and mb_* functions now, but all yielded wrong results. This can't be so difficult, so I am obviously missing something. Would someone share a working implementation for this or point me to a resource where I can finally understand how to do it?

Thanks

P.S. Yes, I have checked https://stackoverflow.com/search?q=truncate+string+php before :)

Answer 1:

Try this:

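function truncate($string, $chars = 50, $terminator = ' …') {
    // Return short strings unaltered; this also keeps the search below in range.
    if (mb_strlen($string) <= $chars) {
        return $string;
    }
    $cutPos = $chars - mb_strlen($terminator);
    // The last space at or before the cut position marks the word boundary.
    $boundaryPos = mb_strrpos(mb_substr($string, 0, $cutPos + 1), ' ');
    return mb_substr($string, 0, $boundaryPos === false ? $cutPos : $boundaryPos) . $terminator;
}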

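But you need to make sure that your internal encoding is properly set, e.g. with mb_internal_encoding('UTF-8'), because the mb_* functions fall back to it when no encoding argument is passed.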
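
To illustrate why the encoding matters, here is a minimal sketch that compares byte-based and multibyte-aware functions on the same string. It assumes the script file and the input are UTF-8; the variable names are only for illustration.

```php
<?php
// Assumption: input strings are UTF-8.
mb_internal_encoding('UTF-8');

$s = 'âãäåæ';

$bytes = strlen($s);     // counts bytes; each of these characters takes 2 bytes
$chars = mb_strlen($s);  // counts characters, using the internal encoding

$firstThree = mb_substr($s, 0, 3);  // cuts after three characters
$brokenCut  = substr($s, 0, 3);     // cuts after three bytes, in the middle of a character

// The byte-based cut is no longer valid UTF-8.
$isValid = mb_check_encoding($brokenCut, 'UTF-8');

var_dump($bytes, $chars, $firstThree, $isValid);
```

Passing the encoding explicitly as the last argument of each mb_* call is an alternative to setting it globally.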



Answer 2:

Just found out PHP already has a multibyte truncate with

  • mb_strimwidth — Get truncated string with specified width

It doesn't obey word boundaries though. But handy nonetheless!
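
A short sketch of how mb_strimwidth behaves on the sentence from the question, assuming UTF-8 input. An ASCII terminator is used here so that every character has a display width of 1; the function measures width rather than character count, so wide East Asian characters would count double.

```php
<?php
// Assumption: UTF-8 input, ASCII-only terminator (' ...').
$in = 'Answer to the Ultimate Question of Life, the Universe, and Everything.';

// Arguments: string, start offset, total width (terminator included),
// trim marker, encoding.
$out = mb_strimwidth($in, 0, 50, ' ...', 'UTF-8');

echo $out, "\n";
// prints "Answer to the Ultimate Question of Life, the U ..."
```

The result is exactly 50 characters wide, but the cut lands inside the word "Universe", which is the word-boundary limitation mentioned above.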



Answer 3:

I don't usually like to just code an entire answer to a question like this. But also I just woke up, and I thought maybe your question would get me in a good mood to go program for the rest of the day.

I didn't try to run this, but it should work or at least get you 90% of the way there.

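function truncate( $string, $chars = 50, $terminate = ' ...' )
{
    // Short enough already: return the string unaltered.
    if ( mb_strlen($string) <= $chars )
        return $string;

    $chars -= mb_strlen($terminate);
    if ( $chars <= 0 )
        return $terminate;

    $string = mb_substr($string, 0, $chars);
    $space = mb_strrpos($string, ' ');

    // No space at all, or only one in the first half: cut hard at the limit
    // rather than dropping more than half of the text.
    if ($space === false || $space < mb_strlen($string) / 2)
        return $string . $terminate;
    else
        return mb_substr($string, 0, $space) . $terminate;
}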
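
For reference, the six steps listed in the question can be sketched one by one as below. This is only a sketch under stated assumptions: the name truncateAtWord is hypothetical, the input is assumed to be UTF-8, and step 2 compares against $chars instead of the calculated limit so that a string which already fits is not shortened.

```php
<?php
// Hypothetical helper; follows the steps listed in the question.
// Assumption: UTF-8 input, passed explicitly to every mb_* call.
function truncateAtWord(string $string, int $chars = 50, string $terminator = ' …'): string
{
    $enc = 'UTF-8';

    // 1. Subtract the terminator length from the maximum length.
    $limit = max(0, $chars - mb_strlen($terminator, $enc));

    // 2. Return the string unaltered if it already fits.
    //    (Deviation from the question, which compares against $limit.)
    if (mb_strlen($string, $enc) <= $chars) {
        return $string;
    }

    // 3. Find the last space at or below the limit. Taking one extra
    //    character lets a space sitting exactly at the limit count.
    $head  = mb_substr($string, 0, $limit + 1, $enc);
    $space = mb_strrpos($head, ' ', 0, $enc);

    // 4. Cut at that space, or at the limit if there is no usable space.
    $cut = ($space === false || $space === 0) ? $limit : $space;

    // 5. and 6. Append the terminator and return.
    return mb_substr($string, 0, $cut, $enc) . $terminator;
}

echo truncateAtWord('Answer to the Ultimate Question of Life, the Universe, and Everything.'), "\n";
// prints "Answer to the Ultimate Question of Life, the …"
```

The strict comparison with false matters because mb_strrpos returns 0 for a match at the start of the string, and 0 and false are easy to confuse with a loose check.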