I recently found a bug in my application caused by unexpected behaviour of array_merge_recursive
. Let's take a look at this simple example:
$array1 = [
1 => [
1 => 100,
2 => 200,
],
2 => [
3 => 1000,
],
3 => [
1 => 500
]
];
$array2 = [
3 => [
1 => 500
]
];
array_merge_recursive($array1, $array2);
//returns: array:4 [ 0 => //...
I expected to get an array with 3 elements: keys 1, 2, and 3. However, the function returns an array with keys 0, 1, 2 and 3. So 4 elements, while I expected only 3. When I replace the numbers by their alphabetical equivalents (a, b, c) it returns an array with only 3 elements: a, b and c.
$array1 = [
'a' => [
1 => 100,
2 => 200,
],
'b' => [
3 => 1000,
],
'c' => [
1 => 500
]
];
$array2 = [
'c' => [
1 => 500
]
];
array_merge_recursive($array1, $array2);
//returns: array:3 [ 'a' => //...
This is (to me at least) unexpected behaviour, but at least it's documented:
http://php.net/manual/en/function.array-merge-recursive.php
If the input arrays have the same string keys, then the values for these keys are merged together into an array, and this is done recursively, so that if one of the values is an array itself, the function will merge it with a corresponding entry in another array too. If, however, the arrays have the same numeric key, the later value will not overwrite the original value, but will be appended.
The documentation isn't very clear about what 'appended' means. It turns out that elements of $array1
with a numeric key will be treated as indexed elements, so they'll lose there current key: the returned array starts with 0. This will lead to strange outcome when using both numeric and string keys in an array, but let's not blame PHP if you're using a bad practice like that. In my case, the problem was solved by using array_replace_recursive
instead, which did the expected trick. ('replace' in that function means replace if exist, append otherwise; naming functions is hard!)
Question 1: recursive or not?
But that's not were this question ends. I thought array_*_resursive
would be a recursive function:
Recursion is a kind of function call in which a function calls itself. Such functions are also called recursive functions. Structural recursion is a method of problem solving where the solution to a problem depends on solutions to smaller instances of the same problem.
It turns out it isn't. While $array1
and $array2
are associative arrays, both $array1['c']
and $array2['c']
from the example above are indexed arrays with one element: [1 => 500]
. Let's merge them:
array_merge_recursive($array1['c'], $array2['c']);
//output: array:2 [0 => 500, 1 => 500]
This is expected output, because both arrays have a numeric key (1
), so the second will be appended to the first. The new array starts with key 0. But let's get back to the very first example:
array_merge_recursive($array1, $array2);
// output:
// array:3 [
// "a" => array:2 [
// 1 => 100
// 2 => 200
// ]
// "b" => array:1 [
// 3 => 1000
// ]
// "c" => array:2 [
// 1 => 500 //<-- why not 0 => 500?
// 2 => 500
// ]
//]
$array2['c'][1]
is appended to $array1['c']
but it has keys 1 and 2. Not 0 and 1 in the previous example. The main array and it's sub-arrays are treated differently when handling integer keys.
Question 2: String or integer key makes a big difference.
While writing this question, I found something else. It's getting more confusing when replacing the numeric key with a string key in a sub-array:
$array1 = [
'c' => [
'a' => 500
]
];
$array2 = [
'c' => [
'a' => 500
]
];
array_merge_recursive($array1, $array2);
// output:
// array:1 [
// "c" => array:1 [
// "a" => array:2 [
// 0 => 500
// 1 => 500
// ]
// ]
//]
So using a string key will cast (int) 500
into array(500)
, while using a integer key won't.
Can someone explain this behaviour?