Hey there. Today I wrote a small benchmark script to compare performance of copying variables vs. creating references to them. I was expecting, that creating references to large arrays for example would be significantly slower than copying the whole array. Here is my benchmark code:
<?php
$array = array();
for($i=0; $i<100000; $i++) {
$array[] = mt_rand();
}
function recursiveCopy($array, $count) {
if($count === 1000)
return;
$foo = $array;
recursiveCopy($array, $count+1);
}
function recursiveReference($array, $count) {
if($count === 1000)
return;
$foo = &$array;
recursiveReference($array, $count+1);
}
$time = microtime(1);
recursiveCopy($array, 0);
$copyTime = (microtime(1) - $time);
echo "Took " . $copyTime . "s \n";
$time = microtime(1);
recursiveReference($array, 0);
$referenceTime = (microtime(1) - $time);
echo "Took " . $referenceTime . "s \n";
echo "Reference / Copy: " . ($referenceTime / $copyTime);
The actual result I got was, that recursiveReference took about 20 times (!) as long as recursiveCopy.
Can somebody explain this PHP behaviour?
PHP will very likely implement copy-on-write for its arrays, meaning when you "copy" an array, PHP doesn't do all the work of physically copying the memory until you modify one of the copies and your variables can no longer reference the same internal representation.
Your benchmarking is therefore fundamentally flawed, as your
recursiveCopy
function doesn't actually copy the object; if it did, you would run out of memory very quickly.Try this: By assigning to an element of the array you force PHP to actually make a copy. You'll find you run out of memory pretty quickly as none of the copies go out of scope (and aren't garbage collected) until the recursive function reaches its maximum depth.
You don't need to (and thus shouldn't) assign or pass variables by reference just for performance reasons. PHP does such optimizations automatically.
The test you ran is flawed because of these automatic optimizations. In ran the following test instead:
This was the output:
As you can see there is no significant performance difference in assigning by reference until you actually write to the copy, i.e. when there is also a functional difference.
Generally speaking in PHP, calling by reference is not something you'd do for performance reasons; it's something you'd do for functional reasons - ie because you actually want the referenced variable to be updated.
If you don't have a functional reason for calling by reference then you should stick with regular parameter passing, because PHP handles things perfectly efficiently that way.
(that said, as others have pointed out, your example code isn't exactly doing what you think it is anyway ;))
recursiveReference is calling recursiveCopy. Not that that would necessarily harm performance, but that's probably not what you're trying to do.
Not sure why performance is slower, but it doesn't reflect the measurement you're trying to make.
in recursiveReference you're calling recursiveCopy... this doesn't make any sense, in this case you're calling recursiveReference just once. correct your code, rund the benchmark again and come back with your new results.
in addition, i don't think it's useful for a benchmark to do this recursively. a better solution would be to call a function 1000 times in a loop - once with the array directly and one with a reference to that array.