Would you use num%2 or num&1 to check if a number is even?

Posted 2019-01-26 04:16

Well, there are at least two low-level ways of determining whether a given number is even or not:

 1. if (num%2 == 0) { /* even */ } 
 2. if ((num&1) == 0) { /* even */ }

I consider the second option to be far more elegant and meaningful, and that's the one I usually use. But it is not only a matter of taste; the actual performance may vary: bitwise operations (such as the bitwise AND here) are usually far more efficient than a mod (or div) operation. Of course, you may argue that some compilers will be able to optimize it anyway, and I agree... but some won't.

Another point is that the second one might be a little harder to comprehend for less experienced programmers. On that I'd answer that it will probably only benefit everybody if these programmers take that short time to understand statements of this kind.

What do you think?

As some comments rightfully state, the given two snippets are correct only if num is either an unsigned int, or a negative number with a two's complement representation.
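
To make the caveat concrete, here is a minimal sketch of mine (assuming a two's complement machine, which describes virtually all current hardware):

#include <iostream>

int main() {
    int num = -3;

    // Since C99/C++11, integer division truncates toward zero, so
    // -3 % 2 == -1. The test (num % 2 == 0) therefore stays correct
    // for negative numbers, but (num % 2 == 1) would not.
    std::cout << (num % 2) << '\n';          // prints -1
    std::cout << ((num % 2) == 0) << '\n';   // prints 0: correctly reports "odd"

    // (num & 1) tests the lowest bit. On a two's complement machine,
    // -3 is ...11111101, so the low bit is 1 and the test agrees with %.
    // On a ones' complement machine, -3 would be ...11111100, and the
    // bit test would wrongly report "even".
    std::cout << ((num & 1) == 0) << '\n';   // prints 0 on two's complement
    return 0;
}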

12 answers
别忘想泡老子
#2 · 2019-01-26 04:25

I code for readability first, so my choice here is num % 2 == 0. It is far clearer than (num & 1) == 0. I'll let the compiler worry about optimizing for me, and only adjust if profiling shows this to be a bottleneck. Anything else is premature.

I consider the second option to be far more elegant and meaningful

I strongly disagree with this. A number is even because its congruency modulo two is zero, not because its binary representation ends with a certain bit. Binary representations are an implementation detail. Relying on implementation details is generally a code smell. As others have pointed out, testing the LSB fails on machines that use ones' complement representations.

Another point is that the second one might be a little harder to comprehend for less experienced programmers. On that I'd answer that it will probably only benefit everybody if these programmers take that short time to understand statements of this kind.

I disagree. We should all be coding to make our intent clearer. If we are testing for evenness the code should express that (and a comment should be unnecessary). Again, testing congruency modulo two more clearly expresses the intent of the code than checking the LSB.

And, more importantly, the details should be hidden away in an isEven method. So we should see if(isEven(someNumber)) { // details } and only see num % 2 == 0 once in the definition of isEven.
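
A minimal sketch of that idea (the names are illustrative):

// The mathematical intent (congruency modulo two) lives in exactly
// one place; every call site just asks the question by name.
inline bool isEven(int num) {
    return num % 2 == 0;
}

// Call sites stay readable and representation-agnostic:
//   if (isEven(someNumber)) { /* ... */ }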

SAY GOODBYE
#3 · 2019-01-26 04:29

At this point, I may be just adding to the noise, but as far as readability goes, the modulo option makes more sense. If your code is not readable, it's practically useless.

Also, unless this is code to be run on a system that's really strapped for resources (I'm thinking microcontroller), don't try to outsmart the compiler's optimizer.

贪生不怕死
#4 · 2019-01-26 04:30

If you're going to say that some compilers won't optimise %2, then you should also note that some compilers use a ones' complement representation for signed integers. In that representation, &1 gives the wrong answer for negative numbers.

So what do you want - code which is slow on "some compilers", or code which is wrong on "some compilers"? Not necessarily the same compilers in each case, but both kinds are extremely rare.

Of course if num is of an unsigned type, or one of the C99 fixed-width integer types (int8_t and so on, which are required to be two's complement), then this isn't an issue. In that case, I consider %2 to be more elegant and meaningful, and &1 to be a hack that might conceivably be necessary sometimes for performance. I think for example that CPython doesn't do this optimisation, and the same will be true of fully interpreted languages (although then the parsing overhead likely dwarfs the difference between the two machine instructions). I'd be a bit surprised to come across a C or C++ compiler that didn't do it where possible, though, because it's a no-brainer at the point of emitting instructions, if not before.
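
To sketch what that means in practice (using the fixed-width types mentioned above; the function names are mine):

#include <cstdint>

// For unsigned types, % 2 and & 1 are interchangeable in all cases.
inline bool isEvenUnsigned(std::uint32_t n) { return n % 2 == 0; }

// std::int32_t is required to be two's complement, so the bit test is
// safe here as well; for plain int it is only as portable as the platform.
inline bool isEvenFixed(std::int32_t n) { return (n & 1) == 0; }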

In general, I would say that in C++ you are completely at the mercy of the compiler's ability to optimise. Standard containers and algorithms have n levels of indirection, most of which disappears when the compiler has finished inlining and optimising. A decent C++ compiler can handle arithmetic with constant values before breakfast, and a non-decent C++ compiler will produce rubbish code no matter what you do.

ら.Afraid
#5 · 2019-01-26 04:31

It's been mentioned a number of times that any modern compiler would create the same assembly for both options. This reminded me of the LLVM demo page that I saw somewhere the other day, so I figured I'd give it a go. I know this isn't much more than anecdotal, but it does confirm what we'd expect: x%2 and x&1 are implemented identically.

I also tried compiling both of these with gcc-4.2.1 (gcc -S foo.c) and the resultant assembly is identical (and pasted at the bottom of this answer).

Program the first:

int main(int argc, char **argv) {
  return (argc%2==0) ? 0 : 1;
}

Result:

; ModuleID = '/tmp/webcompile/_27244_0.bc'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-linux-gnu"

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind readnone {
entry:
    %0 = and i32 %argc, 1       ; <i32> [#uses=1]
    ret i32 %0
}

Program the second:

int main(int argc, char **argv) {
  return ((argc&1)==0) ? 0 : 1;
}

Result:

; ModuleID = '/tmp/webcompile/_27375_0.bc'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-linux-gnu"

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind readnone {
entry:
    %0 = and i32 %argc, 1       ; <i32> [#uses=1]
    ret i32 %0
}

GCC output:

.text
.globl _main
_main:
LFB2:
  pushq %rbp
LCFI0:
  movq  %rsp, %rbp
LCFI1:
  movl  %edi, -4(%rbp)
  movq  %rsi, -16(%rbp)
  movl  -4(%rbp), %eax
  andl  $1, %eax
  testl %eax, %eax
  setne %al
  movzbl  %al, %eax
  leave
  ret
LFE2:
  .section __TEXT,__eh_frame,coalesced,no_toc+strip_static_syms+live_support
EH_frame1:
  .set L$set$0,LECIE1-LSCIE1
  .long L$set$0
LSCIE1:
  .long 0x0
  .byte 0x1
  .ascii "zR\0"
  .byte 0x1
  .byte 0x78
  .byte 0x10
  .byte 0x1
  .byte 0x10
  .byte 0xc
  .byte 0x7
  .byte 0x8
  .byte 0x90
  .byte 0x1
  .align 3
LECIE1:
.globl _main.eh
_main.eh:
LSFDE1:
  .set L$set$1,LEFDE1-LASFDE1
  .long L$set$1
LASFDE1:
  .long LASFDE1-EH_frame1
  .quad LFB2-.
  .set L$set$2,LFE2-LFB2
  .quad L$set$2
  .byte 0x0
  .byte 0x4
  .set L$set$3,LCFI0-LFB2
  .long L$set$3
  .byte 0xe
  .byte 0x10
  .byte 0x86
  .byte 0x2
  .byte 0x4
  .set L$set$4,LCFI1-LCFI0
  .long L$set$4
  .byte 0xd
  .byte 0x6
  .align 3
LEFDE1:
  .subsections_via_symbols
神经病院院长
#6 · 2019-01-26 04:36

I spent years insisting that any reasonable compiler worth the space it consumes on disk would optimize num % 2 == 0 to num & 1 == 0. Then, analyzing disassembly for a different reason, I had a chance to actually verify my assumption.

It turns out, I was wrong. Microsoft Visual Studio, all the way up through version 2013, generates the following object code for num % 2 == 0:

    and ecx, -2147483647        ; the parameter was passed in ECX
    jns SHORT $IsEven
    dec ecx
    or  ecx, -2
    inc ecx
$IsEven:
    neg ecx
    sbb ecx, ecx
    lea eax, DWORD PTR [ecx+1]

Yes, indeed. This is in Release mode, with all optimizations enabled. You get virtually equivalent results whether building for x86 or x64. You probably won't believe me; I barely believed it myself.

It does essentially what you would expect for (num & 1) == 0:

not  eax                        ; the parameter was passed in EAX
and  eax, 1

By way of comparison, GCC (as far back as v4.4) and Clang (as far back as v3.2) do what one would expect, generating identical object code for both variants. However, according to Matt Godbolt's interactive compiler, ICC 13.0.1 also defies my expectations.

Sure, these compilers are not wrong. It's not a bug. There are plenty of technical reasons (as adequately pointed out in the other answers) why these two snippets of code are not identical. And there's certainly a "premature optimization is evil" argument to be made here. Granted, there's a reason it took me years to notice this, and even then I only stumbled onto it by mistake.

But, like Doug T. said, it is probably best to define an IsEven function in your library somewhere that gets all of these little details correct, so that you never have to think about them again and your code stays readable. If you regularly target MSVC, perhaps you'll define this function as I've done:

#include <cassert>

bool IsEven(int value)
{
    const bool result = (value & 1) == 0;     // the form MSVC compiles well
    assert(result == ((value % 2) == 0));     // debug-build check that both forms agree
    return result;
}
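
Note that the assert costs nothing in optimized builds, where NDEBUG is defined and the check compiles away; in debug builds it verifies that the two formulations actually agree.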
成全新的幸福
#7 · 2019-01-26 04:38

Neither approach is obvious, especially to someone who is new to programming. You should define an inline function with a descriptive name (a sketch follows below); the approach you use inside it won't matter, since micro-optimizations most likely won't make your program noticeably faster.

Anyway, I believe 2) is much faster, as it doesn't require a division.
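
A minimal sketch of such a function (the name and the constexpr are illustrative):

// The descriptive name is the interface; the implementation detail
// appears exactly once and could use either % 2 or & 1.
constexpr bool isEven(int n) { return n % 2 == 0; }

// Compile-time sanity checks:
static_assert(isEven(0) && isEven(-4) && !isEven(7), "parity checks");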
