Performance-wise, how fast are bitwise operators vs. modulus?

Published 2019-01-13 16:12

Question:

if(i++ & 1){
}

vs.

if(i % 2){
}

Can using bitwise operations in normal flow or in conditional statements like for, if, etc. increase performance, and would it be better to always use them where possible?

Answer 1:

Unless you're using an ancient compiler, it can already handle this level of conversion on its own. That is to say, a modern compiler can and will implement i % 2 using a bitwise AND instruction, provided it makes sense to do so on the target CPU (which, in fairness, it usually will).

In other words, don't expect to see any difference in performance between these, at least with a reasonably modern compiler with a reasonably competent optimizer. In this case, "reasonably" has a pretty broad definition too: even quite a few compilers that are decades old can handle this sort of micro-optimization with no difficulty at all.
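If you want to confirm this for your own toolchain, one way is to compile two tiny functions with optimization enabled (e.g. g++ -O2 -S, or paste them into Compiler Explorer) and compare the output. A minimal sketch; the function names here are just for illustration:

bool odd_mod(unsigned int i) { return i % 2 != 0; }   // modulus form
bool odd_and(unsigned int i) { return (i & 1) != 0; } // bitwise form
// With optimization on, mainstream compilers typically emit the same
// single AND/TEST instruction for both of these functions.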



Answer 2:

TL;DR Write for semantics first, optimize measured hot-spots second.

At the CPU level, integer modulus and division are among the slowest operations. But you are not writing at the CPU level; instead you write in C++, which your compiler translates to an Intermediate Representation, which is finally translated into assembly for the CPU model you are compiling for.

In this process, the compiler will apply Peephole Optimizations, among which are strength-reduction optimizations such as (courtesy of Wikipedia):

Original Calculation  Replacement Calculation
y = x / 8             y = x >> 3
y = x * 64            y = x << 6
y = x * 2             y = x << 1
y = x * 15            y = (x << 4) - x

The last example is perhaps the most interesting one. While multiplying or dividing by powers of 2 is easily converted (manually) into bit-shift operations, the compiler is generally taught to perform even smarter transformations that you would probably not think of on your own and which are not as easily recognized (at the very least, I do not personally immediately recognize that (x << 4) - x means x * 15).
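A quick self-check of those identities, using unsigned values so the shift forms hold exactly, might look like this (a minimal sketch, not from the original answer):

#include <cassert>
#include <cstdint>

int main() {
    // Verify the strength-reduction identities from the table above.
    // Unsigned arithmetic is used so the shift replacements are exact.
    for (std::uint32_t x = 0; x < 1000; ++x) {
        assert(x / 8  == x >> 3);
        assert(x * 64 == x << 6);
        assert(x * 2  == x << 1);
        assert(x * 15 == (x << 4) - x);
    }
}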



Answer 3:

This is obviously CPU dependent, but you can expect that bitwise operations will never take more, and typically take fewer, CPU cycles to complete. In general, integer / and % are famously slow, as CPU instructions go. That said, with modern CPU pipelines, having a specific instruction complete earlier doesn't mean your program necessarily runs faster.

Best practice is to write code that's understandable and maintainable - expressive of the logic it implements. It's extremely rare that this kind of micro-optimisation makes a tangible difference, so it should only be used if profiling has indicated a critical bottleneck and the change is proven to make a significant difference. Moreover, if on some specific platform it did make a significant difference, your compiler's optimiser may already be substituting a bitwise operation where it can see the two are equivalent.



Answer 4:

By default you should use the operation that best expresses your intended meaning, because you should optimize for readable code. (Today most of the time the scarcest resource is the human programmer.)

So use & if you extract bits, and use % if you test for divisibility, i.e. whether the value is even or odd.

For unsigned values both operations have exactly the same effect, and your compiler should be smart enough to replace the division by the corresponding bit operation. If you are worried you can check the assembly code it generates.

Unfortunately, integer division is slightly irregular on signed values, as it rounds towards zero and the result of % takes the sign of the first operand. Bit operations and right shifts, on the other hand, always round down. So the compiler cannot just replace the division by a simple bit operation. Instead, it may either call a routine for integer division, or replace it with bit operations plus additional logic to handle the irregularity. This may depend on the optimization level and on which of the operands are constants.
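A minimal illustration of that irregularity for a negative operand (assuming two's complement and an arithmetic right shift for signed values, which C++20 guarantees):

#include <iostream>

int main() {
    int i = -3;
    std::cout << (i / 2)  << '\n'; // -1: division truncates toward zero
    std::cout << (i % 2)  << '\n'; // -1: remainder takes the sign of the dividend
    std::cout << (i >> 1) << '\n'; // -2: arithmetic shift rounds toward negative infinity
    std::cout << (i & 1)  << '\n'; //  1: low bit of the two's-complement representation
}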

This irregularity at zero may even be a bad thing, because it is a nonlinearity. For example, I recently had a case where we used division on signed values from an ADC, which had to be very fast on an ARM Cortex M0. In this case it was better to replace it with a right shift, both for performance and to get rid of the nonlinearity.



Answer 5:

C operators cannot be meaningfully compared in terms of "performance". There's no such thing as "faster" or "slower" operators at the language level. Only the resultant compiled machine code can be analyzed for performance. In your specific example the resultant machine code will normally be exactly the same (if we ignore the fact that the first condition includes a postfix increment for some reason), meaning that there won't be any difference in performance whatsoever.



Answer 6:

Here is the optimized (-O3) code generated by the compiler (GCC 4.6) for both options:

int i = 34567;
int opt1 = i++ & 1;
int opt2 = i % 2;

Generated code for opt1:

l     %r1,520(%r11)
nilf  %r1,1
st    %r1,516(%r11)
asi   520(%r11),1

Generated code for opt2:

l     %r1,520(%r11)
nilf  %r1,2147483649
ltr   %r1,%r1
jhe  .L14
ahi   %r1,-1
oilf  %r1,4294967294
ahi   %r1,1
.L14: st %r1,512(%r11)

So 4 extra instructions... which are nothing in a production environment. Rewriting this by hand would be a premature optimization and would just introduce complexity.
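For what it's worth, the extra instructions come from i being signed; with an unsigned operand the two forms are exactly equivalent, so the compiler is free to emit the same single AND for both. A hypothetical variation of the snippet above, not part of the original answer:

unsigned int u = 34567;
unsigned int opt1 = u & 1;
unsigned int opt2 = u % 2; // for unsigned values this typically compiles to the same mask as opt1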



Answer 7:

Bitwise operations are much faster. This is why the compiler will do this trick (and many others) for you. Actually, I think it will be faster to implement it as

(i & 1)

Similarly, if you look at the assembly code your compiler generates, you may see things like x ^= x instead of x = 0, but (I hope) you are not going to write that in your C++ code...

To make it short: do yourself, and whoever will need to maintain your code, a favor. Write it simply, and let the compiler do these micro-optimizations. It will do them better.