Why does a PHP array get modified when it's el

Posted 2020-02-07 05:33

Question:

When ref-assigning an array's element, the contents of the array are modified:

$arr = array(100, 200);
var_dump($arr);
/* shows:
array(2) {
  [0]=>
  int(100)  // ← ← ← int(100)
  [1]=>
  int(200)
}
*/

$r = &$arr[0];
var_dump($arr);
/* shows:
array(2) {
  [0]=>
  &int(100)  // ← ← ← &int(100)
  [1]=>
  int(200)
}
*/

Live run. (It runs fine under the Zend Engine, while HHVM shows "Process exited with code 153".)

Why is the element modified?

Why do we see &int(100) instead of int(100)?

This seems totally bizarre. What's the explanation for this oddity?

Answer 1:

I answered this a while back, but cannot find that answer right now. I believe it went like this:

References are simply "additional" entries in the symbol table for the same value. An entry in the symbol table can only point to a value, not to a value nested inside another value, so it cannot point to an index in an array. When you make a reference to an array index, the value at that index is therefore taken out of the array, a symbol is created for it, and the slot in the array gets a reference to that value:

$foo = array('bar');

symbol | value
-------+----------------
foo    | array(0 => bar)

$baz =& $foo[0];

symbol | value
-------+----------------
foo    | array(0 => $1)
baz    | $1
$1     | bar              <-- pseudo entry for value that can be referenced

Because this is not possible:

symbol | value
-------+----------------
foo    | array(0 => bar)
baz    | &foo[0]          <-- not supported by symbol table

The $1 above is just an arbitrarily chosen "pseudo" name, it has nothing to do with actual PHP syntax or with how the value is actually referenced internally.
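One visible consequence of the slot holding a reference rather than a plain value shows up when the array is copied. The following is a minimal sketch, not taken from the original answer; the variable names `$arr`, `$r` and `$copy` are chosen here for illustration only.

```php
<?php
// Illustrative names; not part of the original answer.
$arr = [100, 200];
$r = &$arr[0];      // slot 0 of $arr now holds a reference shared with $r

$copy = $arr;       // the array is copied, but slot 0 is still the shared reference
$copy[0] = 999;     // writes through the reference, so $arr[0] and $r change too
$copy[1] = 555;     // slot 1 is an ordinary value, so only $copy changes

var_dump($arr[0]);  // int(999)
var_dump($arr[1]);  // int(200)
var_dump($r);       // int(999)
```

Slot 1 behaves like a normal copied value, while slot 0 keeps pointing at the same shared value in both arrays. That shared slot is what the `&` in the `var_dump` output marks.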

As requested in the comments, here is how the symbol table usually behaves with references:

$a = 1;

symbol | value
-------+----------------
a      | 1


$b = 1;

symbol | value
-------+----------------
a      | 1
b      | 1


$c =& $a;

symbol | value
-------+----------------
a, c   | 1
b      | 1
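The tables above can be checked with a short script. This is only a sketch of the observable behavior: `$a` and `$c` are two names for one value, while `$b` holds its own value that merely happens to be equal.

```php
<?php
$a = 1;
$b = 1;
$c =& $a;      // $c becomes a second name for the value behind $a

$c = 2;        // assigning through either name changes the shared value

var_dump($a);  // int(2)
var_dump($b);  // int(1)

unset($c);     // removes only the name $c; the value stays reachable through $a
var_dump($a);  // int(2)
```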