Behavior of array_diff_uassoc not clear

Posted 2019-06-21 05:01

First of all, I should mention that I dug through the manual and the PHP docs and didn't find an answer. Here's the code I use:

class chomik {

    public $state = 'normal';
    public $name = 'no name';

    public function __construct($name) {
        $this->name = $name;
    }

    public function __toString() {
        return $this->name . " - " . $this->state;
    }
}

function compare($a, $b) {
    echo("$a : $b<br/>");
    if($a != $b) {
        return 0;
    }
    else return 1;
}

$chomik = new chomik('a');
$a = array(5, $chomik, $chomik, $chomik);
$b = array($chomik, 'b', 'c', 'd');
array_diff_uassoc($a, $b, 'compare');

I thought array_diff_uassoc would compare all the values of these two arrays and, where the values match, then run the key comparison. The output of this code is:

1 : 0
3 : 1
2 : 1
3 : 2
1 : 0
3 : 1
2 : 1
3 : 2
3 : 3
3 : 2
2 : 3
1 : 3
0 : 3

So first of all, why are some pairs (1 : 0 or 3 : 1) duplicated? Does it mean the function forgot that it had already compared these items? I thought it would compare all equal-by-value pairs, but I don't see that in the output. Am I missing something?

So the question is: what is the exact behavior of this function in terms of the order of comparisons, and why do I see these duplicates? (My PHP version, if it helps, is PHP Version 5.3.6-13ubuntu3.6.)

I'm really confused, and waiting for a good explanation of it...

3 Answers
爷、活的狠高调
#2 · 2019-06-21 05:49

From the OP's comment:

I want only these elements which are not in second array ($a[0])

Can't you use array_diff($a, $b)? It returns:

array(1) {
  [0]=>
  int(5)
}

Otherwise, the documentation states that:

The comparison function must return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second.

As I understand it, that means that the compare() function should be more like this:

function compare($a, $b) {
    echo("$a : $b<br/>");
    if($a === $b) return 0;
    else if ($a > $b) return 1;
    else return -1;
}

However, even with this correction, the comparisons it performs look very strange:

1 : 0
1 : 2
3 : 1
2 : 1
3 : 2
1 : 0
1 : 2
3 : 1
2 : 1
3 : 2
0 : 0
1 : 0
1 : 1
2 : 0
2 : 1
2 : 2
3 : 0
3 : 1
3 : 2
3 : 3

I asked another question about this as it was getting out of the scope of an answer.
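
For reference, here is a minimal sketch of a key callback that follows the three-way contract quoted above, together with the result it produces. The function name compareKeys and the sample arrays are made up for illustration, and the keys are all strings so that their ordering is unambiguous.

```php
<?php
// Hypothetical callback: returns -1, 0, or 1 as the documentation requires.
function compareKeys($a, $b) {
    if ($a === $b) {
        return 0;
    }
    return ($a > $b) ? 1 : -1;
}

// Sample data invented for this sketch (string keys only).
$first  = array('a' => 'green', 'b' => 'brown',  'c' => 'blue', 'd' => 'red');
$second = array('a' => 'green', 'b' => 'yellow', 'e' => 'red');

// An entry of $first is dropped only if $second has the same key AND value.
$result = array_diff_uassoc($first, $second, 'compareKeys');
print_r($result); // keeps 'b' => 'brown', 'c' => 'blue', 'd' => 'red'
```

Only 'a' => 'green' is removed: 'b' has a different value in $second, and 'red' is stored under a different key there.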

祖国的老花朵
#3 · 2019-06-21 06:00

I think you missed the return value section.

Returns an array containing all the entries from array1 that are not present in any of the other arrays.

The array keys are used in the comparison.

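What is missing in the text is that the comparison is only done associatively: an entry is removed only when both its key and its value match an entry in another array. Note that PHP stores integer-like keys as integers, not strings, so an automatically assigned key 0 and a user-defined key '0' refer to the same element.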

So with

$one = array('a', 'b', 'c', 'hot' => 'd'); // 'hot' => 'd' has no key/value match in $two, so it is returned
$two = array('a', 'b', 'c', 'd', 'e', 'f'); // here 'd' is stored under key 3

Because $one's hot => d does not match $two's 3 => d on an associative (key/value) level, hot => d from $one is returned.
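
To make the difference concrete, the sketch below runs the same two arrays through array_diff (values only) and array_diff_uassoc (key/value pairs). The callback name keysAsStrings is hypothetical; it compares keys as strings, which is an assumption chosen here so that integer and string keys are ordered consistently.

```php
<?php
// Hypothetical key callback: cast both keys to strings before comparing,
// so mixed integer/string keys get one consistent ordering.
function keysAsStrings($a, $b) {
    return strcmp((string) $a, (string) $b);
}

$one = array('a', 'b', 'c', 'hot' => 'd');
$two = array('a', 'b', 'c', 'd', 'e', 'f');

// Values only: every value of $one also occurs somewhere in $two.
$byValue = array_diff($one, $two);

// Key/value pairs: 'hot' => 'd' has no counterpart, since 'd' sits at key 3.
$byPair = array_diff_uassoc($one, $two, 'keysAsStrings');

var_dump($byValue === array());            // bool(true)
var_dump($byPair === array('hot' => 'd')); // bool(true)
```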

Because of PHP's quirks when comparing strings and integers, a user-defined function can be used to tighten the comparison with stricter operators such as ===.

This helps in situations where the type is ambiguous: '0' and 0 might look similar, but a strict comparison such as === treats them as different.

Luckily, type hinting is coming to PHP 7 to rid us of this type of weird construct and unclear documentation.

I am adding this from my comment because it pertains to your understanding of which PHP constructs are best used in your case. My comment:

I am not so sure about that, since if($a != $b) { in their code is a problem: they are mistakenly using the equality operator when they should be using the identity operator !==. They are also using numerical keys in a construct designed for associative keys. They are probably also unaware of array_udiff, which is a better match for the data involved.
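
As a rough illustration of the array_udiff suggestion, the sketch below compares objects by value through a callback. The Hamster class and the compareByName function are hypothetical stand-ins, not the OP's code, and comparing by name alone is an assumption.

```php
<?php
// Hypothetical class for this sketch; only the name is compared.
class Hamster {
    public $name;

    public function __construct($name) {
        $this->name = $name;
    }
}

// Value callback following the three-way contract:
// strcmp returns a negative number, 0, or a positive number.
function compareByName($x, $y) {
    return strcmp($x->name, $y->name);
}

$first  = array(new Hamster('a'), new Hamster('b'), new Hamster('c'));
$second = array(new Hamster('b'));

// Keeps the entries of $first whose name does not occur in $second;
// the original keys (0 and 2) are preserved.
$onlyInFirst = array_udiff($first, $second, 'compareByName');
```

Here the callback receives values rather than keys, so the keys play no part in deciding what is removed.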

地球回转人心会变
#4 · 2019-06-21 06:00

This is somewhat intriguing indeed. I looked up the latest source of PHP on GitHub (which is written in C, as you probably know) and tried to make sense of it. (https://github.com/php/php-src/blob/master/ext/standard/array.c)

A quick search showed me that the function in question is declared on line 4308

PHP_FUNCTION(array_diff_uassoc)
{
    php_array_diff(INTERNAL_FUNCTION_PARAM_PASSTHRU, DIFF_ASSOC, DIFF_COMP_DATA_INTERNAL, DIFF_COMP_KEY_USER);
}

So that shows that the actual work is done by the php_array_diff function, which can be found in that same file on line 3938. It's a bit long to paste here (265 lines, to be exact), but you can look it up if you want.

That is the point where I gave up. I have no experience in C whatsoever, and it is too late and I'm too tired to try to make sense of it. I suppose the key comparison is done first, as it is probably more performant than comparing the values, but that is just a guess. Anyway, there is probably a good reason why they do it the way they do.

All that is just a long introduction to ask: why would you want to put an echo inside your compare function in the first place? The point of array_diff_uassoc is the array it returns. You should not rely on how the parser handles it internally. If they decide tomorrow to change the internal workings of that C function to, e.g., do the value comparison first, you'll get an entirely different sequence of comparisons.

Perhaps you could use this replacement function that is written in php: http://pear.php.net/reference/PHP_Compat-1.6.0a2/__filesource/fsource_PHP_Compat__PHP_Compat-1.6.0a2CompatFunctionarray_diff_uassoc.php.html

That way you can rely on the behaviour to not change, and you have full control of the internal workings...
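
To show what such a replacement could look like, here is a minimal sketch of a two-array version written in plain PHP. It is not the PEAR implementation linked above; the name userland_array_diff_uassoc is made up, and it assumes values can be cast to strings, since the built-in function compares values by their string representation.

```php
<?php
// Hypothetical userland sketch, limited to two arrays.
// Comparison order is fully predictable: each entry of $first is checked
// against the entries of $second in order, key first, then value.
function userland_array_diff_uassoc(array $first, array $second, $keyCompare)
{
    $result = array();
    foreach ($first as $key => $value) {
        $found = false;
        foreach ($second as $otherKey => $otherValue) {
            // The callback signals "equal keys" by returning 0.
            if (call_user_func($keyCompare, $key, $otherKey) == 0
                && (string) $value === (string) $otherValue) {
                $found = true;
                break;
            }
        }
        if (!$found) {
            $result[$key] = $value;
        }
    }
    return $result;
}

// Sample data invented for this sketch; strcmp serves as the key callback.
$first  = array('a' => 'green', 'b' => 'brown',  'c' => 'blue', 'd' => 'red');
$second = array('a' => 'green', 'b' => 'yellow', 'e' => 'red');
$diff   = userland_array_diff_uassoc($first, $second, 'strcmp');
```

The nested loops make this quadratic in the array sizes, which is the price of a fixed and easily traceable comparison order.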
