Matlab tic toc accuracy

Posted 2019-04-19 10:42

Question:

I'm timing some code in a loop:

fps = zeros(1, 100);
for i=1:100

    t = tic;
    I = fetch_image_from_source(); % function to get image
    fps(i) = 1./ toc(t);

end
plot(fps);

I get an average of 50 fps.

Then I'd like to add imshow() to my code. I understand that imshow is very slow, so I don't include it inside the tic/toc pair:

fps = zeros(1, 100);
figure;
for i=1:100

    t = tic;
    I = fetch_image_from_source(); % function to get image
    fps(i) = 1./ toc(t);

    imshow(I); drawnow;

end
plot(fps);

Now the measured fps is about 20%-30% lower. Why does this happen, given that imshow() is outside the tic/toc pair?

Answer 1:

The MATLAB documentation discusses timing in general and how elapsed time was, and currently is, measured in MATLAB. It states that "tic and toc [offer] the highest accuracy and most predictable behavior". I think this is a valid statement.

The performance drop observed here is not due to a bad measurement of elapsed time, and it is not related to the use of the imshow or drawnow functions either. I will argue that it is related to caching.

The figure below displays the results of four tests, each of them having its own tic/toc baseline measure (plotted in blue) for 100 iterations. The green line shows the performance in different conditions:

(1)    for ii=1:100
         t = tic;                %single tic/toc
         fps(ii,2) = 1./toc(t); 
         rand(1000);             %extra function outside tic/toc
       end

As reported in your question, we can observe a lower frame rate (FPS; roughly 30% lower, I would say) despite rand being outside of the tic/toc block. The extra function can be of any type (plot, surf, imshow, sum); you will always observe a performance drop.

(2)    for ii=1:100
         t = tic;                %first tic/toc
         fps(ii,2) = 1./toc(t); 
         t = tic;                %second tic/toc
         fps(ii,2) = 1./toc(t);
         rand(1000);             %extra function outside tic/toc
       end

In the second subplot, the tic/toc block is repeated twice, so the fps measurement is executed two times and only the second measurement is kept. The performance drop is no longer there, as if the first tic/toc call prepared the second one (a warm-up). I interpret this in terms of cache: the instructions and/or data are executed and then kept in a low-level memory, so the second call is faster.
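The warm-up idea above can be sketched by taking two back-to-back measurements of the same operation in each iteration and comparing them. This is only an illustration under assumptions: `work` is a hypothetical stand-in for whatever is being timed, and `rand(1000)` plays the role of the extra function outside the timed region.

```matlab
% Sketch: compare a "cold" and a "warm" measurement in every iteration.
% 'work' is a hypothetical placeholder for the operation being timed.
work = @() sum(rand(100));

n = 100;
elapsedCold = zeros(n, 1);   % first measurement of each iteration
elapsedWarm = zeros(n, 1);   % second measurement of each iteration

for ii = 1:n
    t = tic;
    work();
    elapsedCold(ii) = toc(t);

    t = tic;
    work();
    elapsedWarm(ii) = toc(t);

    rand(1000);              % extra function outside both timed regions
end

fprintf('median cold: %.3g s, median warm: %.3g s\n', ...
        median(elapsedCold), median(elapsedWarm));
```

If the timed operation has side effects (for instance, fetching a frame from a camera), calling it twice per iteration changes what the loop does, so this comparison is better suited to diagnosing the effect than to production timing.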

(3)    for ii=1:100
         t = tic;                     %first tic/toc
         fps(ii,2) = 1./toc(t);
         for ij = 1:10000             %10,000 extra tic/toc
           tic;
           tmp = toc;
         end
       end

The third subplot uses 10,000 tic/toc calls as the extra function in a single-call scenario. You can see that the performance is nearly identical. The whole set of data/instructions in this subplot relates only to tic/toc, again with fast cache access.

(4)    for ii=1:100               %first tic/toc block
         t = tic;   
         fps(ii,1) = 1./toc(t);
       end
       for ii=1:100               %second tic/toc block
         t = tic;   
         fps(ii,2) = 1./toc(t);
       end

Finally, the fourth subplot shows two consecutive blocks of tic/toc calls. We can see that the second one performs better than the first one (a warm-up effect).

The overall pattern shown here is not related to imshow and does not depend on the JIT or accel settings; it depends only on successive calls to a particular function. I interpret this in terms of cache, but I lack formal evidence.
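If the goal is a timing that is less sensitive to such warm-up effects, one option (not part of the tests above) is MATLAB's built-in `timeit`, which calls the function several times and reports a typical time per call. A minimal sketch, where `work` is again a hypothetical placeholder for the operation of interest:

```matlab
% Sketch: let timeit handle repetition and first-call costs.
% 'work' is a hypothetical placeholder for the operation of interest.
work = @() sum(rand(100));

tTypical = timeit(work);     % typical time per call, in seconds
fprintf('about %.3g s per call (%.3g calls per second)\n', ...
        tTypical, 1 / tTypical);
```

This measures the operation in isolation, so it will not reproduce the interaction with other code in the loop that the four tests illustrate.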

Here are the plots (one figure with four stacked subplots of FPS per iteration, one per test)

and the code

%% EXTRA FUNCTION (single call)
fps = zeros(100, 2);   % one row per iteration, one column per case

% first case: 100 tic/toc
for ii=1:100
    t = tic;   
    fps(ii,1) = 1./toc(t);
end

%second case: 100 tic/toc + additional function
for ii=1:100

    t = tic;   
    fps(ii,2) = 1./toc(t);

    % graph or scalar functions (uncomment to test)
    %drawnow;
    %plot(1:10)
    rand(1000);          
    %ones(1000, 1000);
    %sum(1:1000000);
    %diff(1:1000000);
end


h = figure('Color','w','Position',[10 10 600 800]);

subplot(4,1,1);
plot(fps); legend({'tic/toc only','extra function'});
ylabel('FPS');
title('extra function, single call','FontSize',14);
set(gca,'FontSize',14, 'YLim', [0 3.5e5]);

%% EXTRA FUNCTION (double call)
fps = zeros(100, 2);   % one row per iteration, one column per case

% first case: 100 tic/toc
for ii=1:100
    t = tic;   
    fps(ii,1) = 1./toc(t);
end

%second case: 100 tic/toc + additional function (except tic/toc)
for ii=1:100

    %first call
    t = tic;   
    fps(ii,2) = 1./toc(t);

    %second call (identical to first)
    t = tic;   
    fps(ii,2) = 1./toc(t);

    rand(1000);
end

subplot(4,1,2);
plot(fps); legend({'tic/toc only','extra function'});
ylabel('FPS');
title('extra function, double call','FontSize',14);
set(gca,'FontSize',14, 'YLim', [0 3.5e5]);


%% IDENTICAL FUNCTION CALLS (10,000 extra tic/toc)
fps = zeros(100, 2);   % one row per iteration, one column per case

% first case: 100 tic/toc
for ii=1:100
    t = tic;   
    fps(ii,1) = 1./toc(t);
end

%second case: 100 tic/toc + 10000 tic/toc
for ii=1:100

    t = tic;   
    fps(ii,2) = 1./toc(t);

    for ij = 1:10000
        tic;
        tmp = toc;
    end

end


subplot(4,1,3);
plot(fps); legend({'tic/toc','extra tic/toc'});
ylabel('FPS');
title('Identical function calls','FontSize',14);
set(gca,'FontSize',14, 'YLim', [0 3.5e5]);


%% TIC/TOC call twice
fps = zeros(100, 2);   % one row per iteration, one column per case

% first case: 100 tic/toc
for ii=1:100
    t = tic;   
    fps(ii,1) = 1./toc(t);
end

for ii=1:100
    t = tic;   
    fps(ii,2) = 1./toc(t);
end

subplot(4,1,4);
plot(fps); legend({'tic/toc (1)','tic/toc (2)'});
ylabel('FPS');
title('tic/toc twice','FontSize',14);
set(gca,'FontSize',14, 'YLim', [0 3.5e5]);


Answer 2:

This could be due to your processor's multi-threading capability.

The number of computational threads used by MATLAB is based on the value of maxNumCompThreads. If you set this to 1, then both cases should theoretically yield the same fps.

You can achieve this with:

LASTN = maxNumCompThreads(N);

Here N should be 1. LASTN receives the previous maximum number of computational threads, which is useful later if you want to restore the setting.
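A minimal sketch of how this could be wrapped around a measurement loop, restoring the previous thread count afterwards. The timed operation `sum(rand(100))` is a hypothetical stand-in, and using `onCleanup` for the restore is my own choice rather than something the answer prescribes:

```matlab
% Sketch: run the measurement single-threaded, then restore the old setting.
lastN = maxNumCompThreads(1);                              % returns the previous maximum
restoreThreads = onCleanup(@() maxNumCompThreads(lastN));  % restore even if an error occurs

n = 100;
fps = zeros(1, n);
for ii = 1:n
    t = tic;
    x = sum(rand(100));      % hypothetical stand-in for the timed operation
    fps(ii) = 1 ./ toc(t);
    rand(1000);              % extra work outside the timed region
end

clear restoreThreads         % triggers the cleanup and restores the thread count
```

Comparing `fps` from this run against a run with the default thread count would show whether multi-threading actually contributes to the difference.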