utf8 string length

Posted 2020-02-13 13:20

Question:

PHP's strlen() function does not return the correct length for strings containing UTF-8 characters. For example, سلام is 4 characters long, but strlen() returns 8:

<?php
echo strlen('سلام');
?>

Answer 1:

The core PHP string functions all assume 1 character = 1 byte. They have no concept of different encodings, so strlen() returns the number of bytes. To find out how many characters are in a UTF-8 string (not how many bytes), use mb_strlen(), the multibyte equivalent, and tell it what encoding the string is in:

echo mb_strlen('سلام', 'UTF-8');
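To make the byte/character difference concrete, here is a small sketch that runs both functions on the string from the question. It assumes the script file itself is saved as UTF-8 and that the mbstring extension is loaded; the variable names are only for illustration.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is available.
$word = 'سلام';

// strlen() counts bytes: each of these four Arabic letters takes 2 bytes in UTF-8.
$bytes = strlen($word);

// mb_strlen() counts characters (code points) in the encoding you name.
$chars = mb_strlen($word, 'UTF-8');

echo "bytes: $bytes, characters: $chars\n"; // bytes: 8, characters: 4
```

Passing 'UTF-8' explicitly means the result does not depend on whatever internal encoding mbstring happens to be configured with.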


Answer 2:

You can count the UTF-8 code points in a PHP string with a regular expression, as long as the string is valid UTF-8:

$length = preg_match_all('(.)su', $subject);

You can also use the multibyte (mbstring) extension if you have it installed:

$length = mb_strlen($subject, 'UTF-8');
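If you are not sure whether mbstring is available, the two approaches can be combined. The following is only a sketch: utf8_length() is a hypothetical helper name (not a built-in), and the type declarations assume PHP 7 or later.

```php
<?php
// utf8_length() is a hypothetical helper, not part of PHP itself.
function utf8_length(string $subject): int
{
    // Prefer mbstring when the extension is loaded.
    if (function_exists('mb_strlen')) {
        return mb_strlen($subject, 'UTF-8');
    }

    // Fallback: count code points with a UTF-8 aware regex.
    // The "u" modifier makes PCRE treat the subject as UTF-8, and "s" lets
    // the dot match newlines too. preg_match_all() returns false when the
    // subject is not valid UTF-8.
    $count = preg_match_all('(.)su', $subject);
    if ($count === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $count;
}

echo utf8_length('سلام'), "\n"; // 4
```

Note that the two branches may not treat malformed input the same way: the regex fallback rejects invalid UTF-8, while mb_strlen() generally still returns a number, so validate the input first if that matters to you.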

See also: PHP UTF-8 String Length



Tags: php string utf-8