PHP Compare whether strings are (almost) equal

I need to compare names which can be written in several ways. For example, a name like St. Thomas is sometimes written like St-Thomas or Sant Thomas. Preferably, I'm looking to build a function that gives a percentage of 'equalness' to a comparison, like some forums do (this post is 5% edited for example).

Tags: php

5 answers

SAY GOODBYE

Reply #2 · 2020-07-23 03:49

There are a few approaches.

similar_text() calculates how similar two strings are and can also report the result as a percentage.

levenshtein() returns the edit distance between two strings. From the PHP manual:

The Levenshtein distance is defined as the minimal number of characters you have to replace, insert or delete to transform str1 into str2

With either function, pick a reasonable threshold and treat anything that clears it as a match.
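The threshold idea could be sketched as follows. `levenshteinPercent()` is a hypothetical helper, not a built-in, and the 75% cutoff is an arbitrary assumption that would need tuning against real data:

```php
<?php
// Hypothetical helper: turns the Levenshtein distance into a 0-100 similarity score
// by scaling it against the length of the longer string.
function levenshteinPercent(string $a, string $b): float
{
    $longest = max(strlen($a), strlen($b));
    if ($longest === 0) {
        return 100.0; // two empty strings are identical
    }
    // levenshtein() works on bytes, so the score is only exact for single-byte text.
    $distance = levenshtein($a, $b);
    return 100.0 * ($longest - $distance) / $longest;
}

$threshold = 75.0; // assumed cutoff; tune it for your own names
$score = levenshteinPercent('St. Thomas', 'St-Thomas');
var_dump($score >= $threshold);
```

Dividing by the longer length keeps the score between 0 and 100, because the distance can never exceed that length.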


欢心

Reply #3 · 2020-07-23 03:52

Check out levenshtein(), which does what you want and is comparatively efficient (but not extremely efficient): http://www.php.net/manual/en/function.levenshtein.php


叛逆

Reply #4 · 2020-07-23 03:57

PHP has two main built-in functions for this.

levenshtein() counts how many changes (insertions, deletions, replacements) are needed to produce string2 from string1. Lower is better.

similar_text() returns the number of matching characters. Higher is better. If you pass a variable as the third argument, which is taken by reference, it is filled with the similarity as a percentage.

<?php
    $originalPost = "Here's my question to stack overflou. Thanks /h2ooooooo";
    $editedPost = "Question to stack overflow.";
    $matchingCharacters = similar_text($originalPost, $editedPost, $matchingPercentage);
    var_dump($matchingCharacters); //int(25) 
    var_dump($matchingPercentage); //float(60.975609756098) (hence edited 40%)
?>
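Applied to the names from the question, the same call might look like the sketch below. `similarTextPercent()` is a hypothetical wrapper introduced only for illustration; it is not part of PHP:

```php
<?php
// Hypothetical helper (not a built-in): returns only the percentage from similar_text().
function similarTextPercent(string $a, string $b): float
{
    // The third argument is taken by reference and receives the percentage.
    similar_text($a, $b, $percent);
    return $percent;
}

$canonical = 'St. Thomas';
foreach (['St-Thomas', 'Sant Thomas', 'St. Thomas'] as $variant) {
    // Print each variant next to its similarity to the canonical spelling.
    printf("%-12s %5.1f%%\n", $variant, similarTextPercent($canonical, $variant));
}
```

The PHP manual notes that swapping the two arguments of similar_text() may change the result, so it is probably safest to always pass the canonical spelling first.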


地球回转人心会变

Reply #5 · 2020-07-23 04:09

The edit distance between two strings of characters generally refers to the Levenshtein distance.

http://php.net/manual/en/function.levenshtein.php


手持菜刀，她持情操

Reply #6 · 2020-07-23 04:10

$v1 = 'pupil';
$v2 = 'people';
# TRUE if $v1 and $v2 have a similar pronunciation
soundex($v1) == soundex($v2);
# Same idea, but metaphone() uses a more accurate algorithm
metaphone($v1) == metaphone($v2);
# Number of characters the two strings have in common;
# $percent receives the similarity as a percentage
$common = similar_text($v1, $v2, $percent);
# Edit distance between the two strings
$diff = levenshtein($v1, $v2);

So either levenshtein($v1, $v2) or similar_text($v1, $v2, $percent) will do it for you, but there is a tradeoff. The complexity of levenshtein() is O(m*n), where m and n are the lengths of $v1 and $v2. That is rather good compared to similar_text(), which is O(max(n,m)**3), but it is still expensive.
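Because of that cost, it may be worth skipping the O(m*n) call when a cheap check already settles the answer. The sketch below assumes a hypothetical helper, `isWithinDistance()`, and relies on the fact that the edit distance can never be smaller than the difference in length between the two strings:

```php
<?php
// Hypothetical helper: true when $a and $b are at most $maxDistance edits apart.
function isWithinDistance(string $a, string $b, int $maxDistance): bool
{
    if ($a === $b) {
        return true; // identical strings need no O(m*n) work
    }
    // The distance is at least the difference in length, so bail out early.
    if (abs(strlen($a) - strlen($b)) > $maxDistance) {
        return false;
    }
    return levenshtein($a, $b) <= $maxDistance;
}

var_dump(isWithinDistance('St. Thomas', 'St-Thomas', 2));            // bool(true)
var_dump(isWithinDistance('St. Thomas', 'Saint Thomas Aquinas', 2)); // bool(false)
```

The early exits only pay off when many candidate pairs differ widely in length or are exact duplicates; for pairs of similar length the full levenshtein() call still runs.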
