Using tic/toc to benchmark MATLAB code

Yesterday I was testing whether using a for loop to add up the elements of an array is slower than using the built-in MATLAB function sum. (As far as I understand, this should be the case, since built-in functions are pre-compiled.) However, I got some weird results:

r = rand(1e8, 1);

tic
sum1 = 0;
for i = 1:1e8
    sum1 = sum1 + r(i);
end
t1 = toc;

tic
sum2 = sum(r);
t2 = toc;


>> t1

t1 =

    0.5872

>> t2

t2 =

    0.1053

Those are the results I get in MATLAB 2011. However, when I tested it in MATLAB 2013, using sum was slower than the for loop. Did I mess something up, or am I missing something?

Which is better: the for loop or sum?

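function timing_builtins()
    % Setup: fix the seed so every run times the same data
    rng(123);
    r = rand(1e8, 1);

    % Add up the elements one at a time in an explicit loop
    function test1(r)
        sum1 = 0;
        for i = 1:numel(r)
            sum1 = sum1 + r(i);
        end
    end

    % Add up the elements with the built-in sum
    function test2(r)
        sum2 = sum(r);
    end

    % timeit calls each handle several times and returns a typical run time
    t1 = timeit(@() test1(r));
    t2 = timeit(@() test2(r));

    fprintf('For loop: %f seconds\n', t1);
    fprintf('Sum call: %f seconds\n', t2);
end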

This should give you a better view of the speedup: in this specific case, sum is about 10 times faster than the for loop.

If we use the predefined variable r instead, we can see that the built-in sum is indeed faster than the for loop, by a factor of about 4.
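To illustrate why a single tic/toc pair can mislead, here is a minimal sketch that repeats the measurement and reports the median. It is only a rough approximation of the idea behind timeit, not its actual implementation; the array size, nRuns, and the variable names are arbitrary choices made for this example.

```matlab
% Sketch: repeat a tic/toc measurement and summarise the spread.
% The array size and nRuns are arbitrary values chosen for illustration.
r = rand(1e6, 1);
nRuns = 11;
times = zeros(nRuns, 1);

s = sum(r);                  % warm-up call, deliberately not timed

for k = 1:nRuns
    tStart = tic;            % keep the timer handle so other tic calls cannot interfere
    s = sum(r);
    times(k) = toc(tStart);  % elapsed seconds for this run
end

fprintf('sum: median %g s (min %g s, max %g s) over %d runs\n', ...
        median(times), min(times), max(times), nRuns);
```

If the minimum and maximum differ a lot, a single tic/toc reading could have landed anywhere in that range, which may explain results that seem to flip between runs or MATLAB versions.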

Built-ins are almost always the faster choice. The JIT engine has been overhauled recently, which reduces this factor (I used 2012a), but for the exact same operation a built-in will usually be faster. The same holds for vectorising with bsxfun: that will also (almost) always be faster than using loops.
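As a small illustration of the bsxfun point, the sketch below subtracts the column means from a matrix once with a loop and once with bsxfun, then checks that both give the same result. The matrix size and variable names are made up for this example, and no timings are claimed; each variant can be wrapped in a function and passed to timeit to compare them on your own machine.

```matlab
% Sketch: the same operation written as a loop and with bsxfun.
% Matrix size and variable names are assumptions for illustration.
A  = rand(2000, 500);
mu = mean(A, 1);                 % 1-by-500 row vector of column means

% Loop version: subtract each column's mean from that column
B1 = zeros(size(A));
for j = 1:size(A, 2)
    B1(:, j) = A(:, j) - mu(j);
end

% Vectorised version: bsxfun expands mu along the rows of A
B2 = bsxfun(@minus, A, mu);

% Both versions should agree up to floating-point tolerance
maxDiff = max(abs(B1(:) - B2(:)));
fprintf('Largest difference between loop and bsxfun: %g\n', maxDiff);
```

In R2016b and later, implicit expansion lets you write A - mu directly, with the same result as the bsxfun call.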

To cite @beaker:

The answer to the more general question is, "it depends." Sometimes loops are faster because they don't do all of the error checking and type conversions that a generalized built-in would.

Thanks to @Daniel and @rayryeng, it seems the call to sum is indeed still faster in the newest MATLAB versions. The reason is that sum requires almost no checks, since there is not much that can go wrong when summing elements. Also, the native sum calls a function from the LAPACK / SuiteSparse interface, which is highly optimised.