How to use PHP similar_text() with Arabic text

I'm trying to use PHP's similar_text() with Arabic text, but it's not working, although it works fine with English.

<?php
// The third argument is passed by reference, so it must be a variable, not the string "$per".
$var = similar_text("ياسر", "عمار", $per);
echo $var;
?>
Output: 5

That's the wrong result; it should be 2. Is there a version of similar_text() that works with Arabic letters?

Tags: php string function

3 answers

爷的心禁止访问

Reply #2 · 2020-07-24 13:23

Here's one I'm using:

//from http://www.phperz.com/article/14/1029/31806.html
function mb_split_str($str) {
    preg_match_all("/./u", $str, $arr);
    return $arr[0];
}

//based on http://www.phperz.com/article/14/1029/31806.html, added percent
function mb_similar_text($str1, $str2, &$percent) {
    $arr_1 = array_unique(mb_split_str($str1));
    $arr_2 = array_unique(mb_split_str($str2));
    $similarity = count($arr_2) - count(array_diff($arr_2, $arr_1));
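    // Use character counts rather than byte counts (strlen), otherwise the
    // percentage is halved for two-byte characters such as Arabic letters.
    $total = mb_strlen($str1, 'UTF-8') + mb_strlen($str2, 'UTF-8');
    $percent = $total > 0 ? ($similarity * 200) / $total : 0;
    return $similarity;
}

$var = mb_similar_text('عمار', 'ياسر', $per);
Output: $var = 2, $per = 50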


孤傲高冷的网名

Reply #3 · 2020-07-24 13:32

Just for the record, and hopefully to be of some help, I want to clarify how the similar_text() function behaves when it is given multi-byte character strings (including Arabic strings).

The function simply treats each byte of the input string as an individual character (which means it supports neither multi-byte characters nor Unicode).

The byte streams of the strings عمار and ياسر are shown below, written here in their two-byte UTF-16BE form, which equals the Unicode code points (in UTF-8 the byte values differ, but each letter still takes two bytes). The bytes are in hexadecimal and separated by a `.`; a `:` marks the end of a character:

06.39:06.45:06.27:06.31   <-- Byte stream for عمار
||    ||    ||    || ||
06.4A:06.27:06.33:06.31   <-- Byte stream for ياسر

As you can tell, there are five matching bytes, and that is why the function returns 5 in this case (every two hexadecimal digits represent one byte).
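
To see the same effect on a real script, here is a minimal sketch that dumps the bytes similar_text() actually compares. It assumes the file is saved as UTF-8, so the hex values are the UTF-8 encodings of the letters rather than their code points; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8.
$a = "عمار";
$b = "ياسر";

// Each Arabic letter occupies two bytes in UTF-8.
echo bin2hex($a), "\n"; // d8b9d985d8a7d8b1
echo bin2hex($b), "\n"; // d98ad8a7d8b3d8b1

// similar_text() works on these 8-byte sequences, not on the 4 letters.
echo strlen($a), "\n";           // 8
echo similar_text($b, $a), "\n"; // 5
```

The lead bytes d8 and d9 are shared by many Arabic letters, so they match even when the letters themselves differ, which inflates the count.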


Viruses.

Reply #4 · 2020-07-24 13:36

Because Arabic text is stored as multi-byte strings, byte-oriented PHP functions such as similar_text() do not give correct results for it.

echo(strlen("عمار"));

The above code outputs: 8

echo(mb_strlen("عمار", "UTF-8"));

Using the mb_strlen function with the UTF-8 encoding specified, the output is: 4 (the correct number of characters).

You can use the mb_ functions to make your own version of the similar_text function: http://php.net/manual/en/ref.mbstring.php
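
As one possible way to do that, the sketch below splits both strings into characters with mb_str_split() (PHP 7.4+, mbstring extension) and then applies the same kind of recursion similar_text() is documented to use: find the longest common run, then repeat on the parts to its left and right. The function names similar_chars and mb_similar_text_sketch are hypothetical, and this is an illustration under the assumption of UTF-8 input, not a drop-in replacement.

```php
<?php
// Hypothetical helper: count matching characters between two arrays of
// characters by recursing around the first longest common run.
function similar_chars(array $a, array $b): int {
    $lenA = count($a);
    $lenB = count($b);
    $max = 0;
    $posA = 0;
    $posB = 0;

    // Find the first longest common run of characters.
    for ($i = 0; $i < $lenA; $i++) {
        for ($j = 0; $j < $lenB; $j++) {
            $k = 0;
            while ($i + $k < $lenA && $j + $k < $lenB && $a[$i + $k] === $b[$j + $k]) {
                $k++;
            }
            if ($k > $max) {
                $max = $k;
                $posA = $i;
                $posB = $j;
            }
        }
    }
    if ($max === 0) {
        return 0;
    }
    // Recurse on the parts to the left and to the right of that run.
    return $max
        + similar_chars(array_slice($a, 0, $posA), array_slice($b, 0, $posB))
        + similar_chars(array_slice($a, $posA + $max), array_slice($b, $posB + $max));
}

// Hypothetical wrapper with the same shape as similar_text().
// Assumption: both inputs are valid UTF-8.
function mb_similar_text_sketch(string $s1, string $s2, &$percent = null): int {
    $a = mb_str_split($s1, 1, 'UTF-8');
    $b = mb_str_split($s2, 1, 'UTF-8');
    $sim = similar_chars($a, $b);
    $total = count($a) + count($b);
    $percent = $total > 0 ? $sim * 200 / $total : 0;
    return $sim;
}

echo mb_similar_text_sketch("ياسر", "عمار", $per), "\n"; // 2
echo $per, "\n"; // 50
```

Unlike a set-based comparison, this version respects character order and repeated letters, so for plain ASCII input it should agree with similar_text() (for example, both give 4 for "World" and "Word").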

