Measure string size in Bytes in php

2019-01-10 22:04发布

I am doing a real estate feed for a portal and it is telling me the max length of a string should be 20,000 bytes (20kb), but I have never run across this before.

How can I measure byte size of a varchar string. So I can then do a while loop to trim it down.

5条回答
叛逆
2楼-- · 2019-01-10 22:42

You can use mb_strlen() to get the byte length using a encoding that only have byte-characters, without worring about multibyte or singlebyte strings. For example, as drake127 saids in a comment of mb_strlen, you can use '8bit' encoding:

<?php
    $string = 'Cién cañones por banda';
    echo mb_strlen($string, '8bit');
?>

You can have problems using strlen function since php have an option to overload strlen to actually call mb_strlen. See more info about it in http://php.net/manual/en/mbstring.overload.php

For trim the string by byte length without split in middle of a multibyte character you can use:

mb_strcut(string $str, int $start [, int $length [, string $encoding ]] )
查看更多
beautiful°
3楼-- · 2019-01-10 22:48

PHP's strlen() function returns the number of ASCII characters.

strlen('borsc') -> 5 (bytes)

strlen('boršč') -> 7 (bytes)

$limit_in_kBytes = 20000;

$pointer = 0;
while(strlen($your_string) > (($pointer + 1) * $limit_in_kBytes)){
    $str_to_handle = substr($your_string, ($pointer * $limit_in_kBytes ), $limit_in_kBytes);
    // here you can handle (0 - n) parts of string
    $pointer++;
}

$str_to_handle = substr($your_string, ($pointer * $limit_in_kBytes), $limit_in_kBytes);
// here you can handle last part of string

.. or you can use a function like this:

function parseStrToArr($string, $limit_in_kBytes){
    $ret = array();

    $pointer = 0;
    while(strlen($string) > (($pointer + 1) * $limit_in_kBytes)){
        $ret[] = substr($string, ($pointer * $limit_in_kBytes ), $limit_in_kBytes);
        $pointer++;
    }

    $ret[] = substr($string, ($pointer * $limit_in_kBytes), $limit_in_kBytes);

    return $ret;
}

$arr = parseStrToArr($your_string, $limit_in_kBytes = 20000);
查看更多
Animai°情兽
4楼-- · 2019-01-10 22:50

Further to PhoneixS answer to get the correct length of string in bytes - Since mb_strlen() is slower than strlen(), for the best performance one can check "mbstring.func_overload" ini setting so that mb_strlen() is used only when it is really required:

$content_length = ini_get('mbstring.func_overload') ? mb_strlen($content , '8bit') : strlen($content);
查看更多
啃猪蹄的小仙女
5楼-- · 2019-01-10 22:51

Do you mean byte size or string length?

Byte size is measured with strlen(), whereas string length is queried using mb_strlen(). You can use substr() to trim a string to X bytes (note that this will break the string if it has a multi-byte encoding - as pointed out by Darhazer in the comments) and mb_substr() to trim it to X characters in the encoding of the string.

查看更多
叛逆
6楼-- · 2019-01-10 23:08

You have to figure out if the string is ascii encoded or encoded with a multi-byte format.

In the former case, you can just use strlen.

In the latter case you need to find the number of bytes per character.

the strlen documentation gives an example of how to do it : http://www.php.net/manual/en/function.strlen.php#72274

查看更多
登录 后发表回答