PHP Reference in array key

Published 2019-09-16 17:11

Question:

PHP:

$a = array("key" => 23);
var_dump($a);

$c = &$a["key"];
var_dump($a);

unset($c);
var_dump($a);

Output:

array(1) {
  ["key"]=>
  int(23)
}
array(1) {
  ["key"]=>
  &int(23)
}
array(1) {
  ["key"]=>
  int(23)
}

In the second dump the value of "key" is shown as a reference. Why is that? If I do the same with a normal variable instead of an array key this does not happen.

My only explanation would be that array keys are usually stored as references and as long as there is only one entry in the symbol table it is shown as a scalar in the dump.

Answer 1:

Internally, PHP arrays are hash maps (also called dictionaries or HashTables). Even a numerically indexed array is implemented as a hash table, and the array itself is stored in a zval, just like any other value.
However, what you're seeing is expected behaviour, which is explained both here and here.

Basically, this is what the zval that holds your array looks like internally (this is the PHP 5 layout):

typedef struct _zval_struct {
    zvalue_value value;
    zend_uint refcount__gc;
    zend_uchar type;
    zend_uchar is_ref__gc;
} zval;
// zvalue_value:
typedef union _zvalue_value {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;
    zend_object_value obj;
} zvalue_value;

In the case of an array, zval.type is set to indicate that the value is an array, and so the zvalue_value.ht member is used.
What happens when you write $c = &$a['key'] is that the zval assigned to $a['key'] is updated: zval.refcount__gc is incremented, and is_ref__gc is set to 1. The value is not copied; it is now used by more than one variable, which is exactly what makes it a reference. Once you call unset($c), the refcount is decremented, the value is no longer shared, and so is_ref is set back to 0.
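
To make the sharing concrete, here is a minimal sketch based on the snippet from the question. The value 42 and the variable names $afterWrite and $afterUnset are illustrative assumptions, not part of the original code:

```php
<?php
$a = array("key" => 23);

$c = &$a["key"];          // $c and $a["key"] now share one value
$c = 42;                  // writing through $c changes the array element
$afterWrite = $a["key"];

unset($c);                // removes only the name $c; the element survives
$afterUnset = $a["key"];

echo $afterWrite, " ", $afterUnset, PHP_EOL;
```

Running it should print 42 twice: the write through $c lands in the array, and unset($c) removes the name $c without touching the value.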

Now for the big one: why don't you see the same thing when you use regular, scalar variables? Well, that's because an array is a HashTable, complete with its own internal ref-counting (zval_ptr_dtor). Once the array itself is no longer needed, it should be destroyed together with its elements. If you create a reference to an array value and then unset the array, the element's zval would be GC'ed, and that would leave you with a reference to a destroyed zval floating around. Note also that var_dump() receives a top-level variable by value, so it never shows the & marker for the variable itself; the marker only appears on elements nested inside an array.
Therefore, the zval in the array is changed to a reference, too: a reference can be deleted safely. So if you were to do this:

$foo = array(123);
$bar = &$foo[0];
unset($foo[0]);
echo $bar, PHP_EOL;

Your code will still work as expected: $foo[0] no longer exists, but $bar is now the only existing reference to 123.
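
The difference between a plain variable and an array element can also be checked directly. The following is a rough sketch only: dump_to_string() is a hypothetical helper introduced here to capture what var_dump() prints, and the variable names are assumptions. It also assumes the stock var_dump() (extensions such as Xdebug may format the output differently):

```php
<?php
// Hypothetical helper: return what var_dump() would print for $value.
function dump_to_string($value) {
    ob_start();
    var_dump($value);
    return ob_get_clean();
}

// Plain variable with a reference to it.
$x = 23;
$y = &$x;
$scalarDump = dump_to_string($x);

// Array element with a reference to it.
$a = array("key" => 23);
$c = &$a["key"];
$arrayDump = dump_to_string($a);

// Only the nested array element carries the & marker.
echo $scalarDump;
echo $arrayDump;
```

The first dump contains no & marker, while the second shows &int(23) for "key", matching the output in the question.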

This is just a really short and incomplete explanation. Look up the PHP internals: how memory management works, how references are dealt with internally, and how the garbage collector uses the is_ref and refcount members to manage memory.
Pay special attention to internal mechanisms like copy-on-write and, when looking through the first link I provided here, look for the snippet that looks like this:

$ref = &$array;
foreach ($ref as $val) {}

That snippet deals with some oddities in how references and arrays interact.
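
Another oddity of the same kind, sketched here with hypothetical variable names: because the reference flag lives on the array element, a plain copy of the array still shares that element for as long as the reference exists.

```php
<?php
$foo = array(1);
$ref = &$foo[0];    // element 0 of $foo is now flagged as a reference

$copy = $foo;       // copies the array, but element 0 stays shared
$copy[0] = 2;       // so this write goes through to the shared value

echo $foo[0], PHP_EOL;   // 2
echo $ref, PHP_EOL;      // 2
```

Without the $ref line, $copy would be fully independent and $foo[0] would stay 1.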