Check unicode in PHP

#2 · 2020-05-20 08:16

You can try:

mb_check_encoding($s,"UTF-8")
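One caveat worth illustrating: mb_check_encoding() reports whether the bytes form valid UTF-8, and a plain ASCII string is valid UTF-8 too, so a true result does not by itself mean the string contains non-ASCII characters. A minimal sketch, assuming the mbstring extension is loaded and PHP 7+ for the `\u{...}` escapes; the sample strings are made up for illustration:

```php
<?php
// Plain ASCII is a subset of UTF-8, so this is valid.
var_dump(mb_check_encoding('hello', 'UTF-8'));      // bool(true)

// "café" with é written as the code point U+00E9 (stored as UTF-8 bytes C3 A9).
var_dump(mb_check_encoding("caf\u{E9}", 'UTF-8'));  // bool(true)

// "café" as ISO-8859-1 bytes: a lone 0xE9 is not a valid UTF-8 sequence.
var_dump(mb_check_encoding("caf\xE9", 'UTF-8'));    // bool(false)
```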



我想做一个坏孩纸

#3 · 2020-05-20 08:16

You'd usually do something like:

if (mb_strlen($ch) != strlen($ch)) ...

I should add: strlen() counts bytes, while mb_strlen() counts characters and handles multi-byte characters properly. I guess multi-byte characters are what you're really asking about rather than Unicode as such, since Unicode also includes the 128 ASCII characters, which UTF-8 encodes as single bytes indistinguishable from ASCII.
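To make the byte/character difference concrete, here is a small sketch. It assumes the string is UTF-8 and that the mbstring extension is loaded; the helper name `has_multibyte` is hypothetical, not part of PHP:

```php
<?php
// Hypothetical helper wrapping the comparison above.
// Assumes $s is UTF-8; the encoding is passed explicitly so the result
// does not depend on the configured default encoding.
function has_multibyte(string $s): bool
{
    return mb_strlen($s, 'UTF-8') !== strlen($s);
}

$ascii = 'abc';
$mixed = "na\u{EF}ve";   // "naïve": 5 characters, 6 bytes in UTF-8

printf("%d bytes, %d chars\n", strlen($ascii), mb_strlen($ascii, 'UTF-8')); // 3 bytes, 3 chars
printf("%d bytes, %d chars\n", strlen($mixed), mb_strlen($mixed, 'UTF-8')); // 6 bytes, 5 chars
```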


【Aperson】

#4 · 2020-05-20 08:16

In UTF-8, every byte of a non-ASCII character has its most significant bit set, whether it is a lead byte or a continuation byte of a multi-byte character. You can't rely on the string having more bytes than characters, because that only holds for UTF-8: in single-byte encodings such as ISO-8859-1, non-ASCII characters take one byte each. If any byte in the string has a value greater than 127, the string contains non-ASCII characters.
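The "any byte greater than 127" test can be sketched without mbstring. The helper name `has_non_ascii` is hypothetical; the regular expression runs in byte mode (no `u` modifier), so it works regardless of the string's encoding:

```php
<?php
// Hypothetical helper: true if any byte in $s lies outside 0x00-0x7F.
function has_non_ascii(string $s): bool
{
    return preg_match('/[^\x00-\x7F]/', $s) === 1;
}

var_dump(has_non_ascii('abc'));          // bool(false)
var_dump(has_non_ascii("caf\u{E9}"));    // bool(true), UTF-8 bytes C3 A9
var_dump(has_non_ascii("caf\xE9"));      // bool(true), single ISO-8859-1 byte
```

Note that this only tells you the string is not pure ASCII; it says nothing about which encoding the high bytes belong to.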


啃猪蹄的小仙女

#5 · 2020-05-20 08:19

Strings in PHP are byte streams, not character streams. You can't actually have Unicode strings in PHP; you need to encode your characters with some encoding. If you want to cover the entire Unicode range, UTF-8 is the most obvious choice.
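A short illustration of the byte-stream point, assuming PHP 7+ (for the `\u{...}` escape) and the mbstring extension for the last line; the euro sign is just an arbitrary sample character:

```php
<?php
$euro = "\u{20AC}";            // the euro sign, U+20AC, stored as UTF-8 bytes

echo strlen($euro), "\n";      // 3 (bytes, not characters)
echo bin2hex($euro), "\n";     // e282ac
echo bin2hex($euro[0]), "\n";  // e2 (indexing yields a single byte)

// The same character in another encoding is a different byte sequence.
$utf16 = mb_convert_encoding($euro, 'UTF-16BE', 'UTF-8');
echo bin2hex($utf16), "\n";    // 20ac
```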

If you want to get the code points of a UTF-8 encoded byte stream, you can use this library: http://hsivonen.iki.fi/php-utf8/

However, I wonder what exactly you need this for. Most likely, you can solve all your woes by simply using UTF-8.


Deceive 欺骗

#6 · 2020-05-20 08:20

Actually, you don't even need the mbstring extension:

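// utf8_decode() converts UTF-8 to ISO-8859-1, one byte per character,
// so the lengths differ only if $string contains multi-byte characters.
// Note: utf8_decode() is deprecated as of PHP 8.2.
if (strlen($string) != strlen(utf8_decode($string)))
{
    echo 'is unicode';
}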

And to find the code point of a given character (this one does use mbstring):

$ord = unpack('N', mb_convert_encoding($string, 'UCS-4BE', 'UTF-8'));

echo $ord[1];
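On PHP 7.2 or later with mbstring, mb_ord() is a more direct alternative for a single character. A brief sketch, with the euro sign as an arbitrary sample:

```php
<?php
// mb_ord() returns the code point of the first character of the string.
$codePoint = mb_ord("\u{20AC}", 'UTF-8');

echo $codePoint, "\n";              // 8364
printf("U+%04X\n", $codePoint);     // U+20AC

// mb_chr() is the inverse: code point back to a UTF-8 encoded string.
$roundTrip = mb_chr($codePoint, 'UTF-8');
```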


做自己的国王

#7 · 2020-05-20 08:21

Thanks, guys. I finally got the answer I was looking for.

I got an include file from http://hsivonen.iki.fi/php-utf8/.

The following code solved my problem:

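<?php
  require_once("utf8.inc");
  /*** create a unicode string (this file must be saved as UTF-8) ***/
  $s = "حملة إلا صلاتي";
  $out = utf8ToUnicode($s);
  /*** $out holds one code point per character, so loop over count($out), not strlen($s) ***/
  for ($i = 0; $i < count($out); $i++)
    echo dechex($out[$i]).".";
?>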
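For comparison, the same kind of output can be produced without the include file by extending the unpack() approach from the earlier reply to the whole string. This is only a sketch, assuming the mbstring extension and valid UTF-8 input; the helper name `utf8_code_points_hex` is hypothetical:

```php
<?php
// Hypothetical helper: returns the code points of a UTF-8 string
// as lowercase hex strings, one entry per character.
function utf8_code_points_hex(string $s): array
{
    if ($s === '') {
        return [];
    }
    // UCS-4BE stores every character as a 4-byte big-endian integer,
    // so unpack('N*') yields one integer per character.
    $ucs4 = mb_convert_encoding($s, 'UCS-4BE', 'UTF-8');
    $codePoints = unpack('N*', $ucs4);
    if ($codePoints === false) {
        return [];
    }
    return array_map('dechex', array_values($codePoints));
}

echo implode('.', utf8_code_points_hex("a\u{E9}\u{20AC}")), "\n";   // 61.e9.20ac
```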

