Performance of array of functions over if and swit

2019-01-18 11:19发布

I am writing a very performance critical part of the code and I had this crazy idea about substituting case statements (or if statements) with array of function pointers.

Let me demonstrate; here goes the normal version:

while(statement)
{
    /* 'option' changes on every iteration */

    switch(option)
    {
        case 0: /* simple task */ break;
        case 1: /* simple task */ break;
        case 2: /* simple task */ break;
        case 3: /* simple task */ break;
    }
}

And here is the "callback function" version:

void task0(void) {
    /* simple task */
}

void task1(void) {
    /* simple task */
}

void task2(void) {
    /* simple task */
}

void task3(void) {
    /* simple task */
}

void (*task[4]) (void);

task[0] = task0;
task[1] = task1;
task[2] = task2;
task[3] = task3;

while(statement)
{
    /* 'option' changes on every iteration */

    /* and now we call the function with 'case' number */
    (*task[option]) ();
}

So which version will be faster? Is the overhead of the function call eliminating speed benefit over normal switch (or if) statement?

Ofcourse the latter version is not so readable but I am looking for all the speed I can get.

I am about to benchmark this when I get things set up but if someone has an answer already, I wont bother.

6条回答
你好瞎i
2楼-- · 2019-01-18 11:39

I think at the end of the day your switch statements will be the fastest, because function pointers have the "overhead" of the lookup of the function and the function call itself. A switch is just a jmp table straight. It of course depends on different things which only testing can give you an answer to. That's my two cent worth.

查看更多
Evening l夕情丶
3楼-- · 2019-01-18 11:43

Which version will be faster depends. The naive implementation of switch is a huge if ... else if ... else if ... construction meaning it takes on average O(n) time to execute where n is the number of cases. Your jump table is O(1) so the more different cases there are and the more the later cases are used, the more likely the jump table is to be better. For a small number of cases or for switches where the first case is chosen more frequently than others, the naive implementation is better. The matter is complicated by the fact that the compiler may choose to use a jump table even when you have written a switch if it thinks that will be faster.

The only way to know which you should choose is to performance test your code.

查看更多
走好不送
4楼-- · 2019-01-18 11:44

I arrived at this post recently since I was wondering the same. I ended up taking the time to try it. It certainly depends greatly on what you're doing, but for my VM it was a decent speed up (15-25%), and allowed me to simplify some code (which is probably where a lot of the speedup came from). As an example (code simplified for clarity), a "for" loop was able to be easily implemented using a for loop:

void OpFor( Frame* frame, Instruction* &code )
{
  i32 start = GET_OP_A(code);
  i32 stop_value = GET_OP_B(code);
  i32 step = GET_OP_C(code);
  // instruction count (ie. block size)
  u32 i_count = GET_OP_D(code);
  // pointer to end of block (NOP if it branches)
  Instruction* end = code + i_count;

  if( step > 0 )
  {
    for( u32 i = start; i < stop_value; i += step )
    {
      // rewind instruction pointer
      Instruction* cur = code;

      // execute code inside for loop
      while(cur != end)
      {
        cur->func(frame, cur);
        ++cur;
      }
    }
  }
  else
    // same with <=
}
查看更多
The star\"
5楼-- · 2019-01-18 11:45

The switch statement should be compiled into a branch table, which is essentially the same thing as your array of functions, if your compiler has at least basic optimization capability.

查看更多
倾城 Initia
6楼-- · 2019-01-18 11:49

First, I would randomly-pause it a few times, to make certain enough time is spent in this dispatching to even bother optimizing it.

Second, if it is, since each branch spends very few cycles, you want a jump table to get to the desired branch. The reason switch statements exist is to suggest to the compiler that it can generate one if the switch values are compact.

How long is the list of switch values? If it's short, the if-ladder could still be faster, especially if you put the most frequently used codes at the top. An alternative to an if-ladder (that I've never actually seen anyone use) is an if-tree, the code equivalent of a binary tree.

You probably don't want an array of function pointers. Yes, it's an array reference to get the function pointer, but there's several instructions' overhead in calling a function, and it sounds like that could overwhelm the small amount being done inside each function.

In any case, looking at the assembly language, or single-stepping at the instruction level, will give you a good idea how efficient it's being.

查看更多
劫难
7楼-- · 2019-01-18 11:51

A good compiler will compile a switch with cases in a small numerical range as a single conditional to see if the value is in that range (which can sometimes be optimized out) followed by a jumptable jump. This will almost surely be faster than a function call (direct or indirect) because:

  1. A jump is a lot less expensive than a call (which must save call-clobbered registers, adjust the stack, etc.).
  2. The code in the switch statement cases can make use of expression values already cached in registers in the caller.

It's possible that an extremely advanced compiler could determine that the call-via-function pointer only refers to one of a small set of static-linkage functions, and thereby optimize things heavily, maybe even eliminating the calls and replacing them by jumps. But I wouldn't count on it.

查看更多
登录 后发表回答