I have disassembled code produced by the compiler, and I see that it has produced the following sequence of instructions:
mov eax, edx
shr eax, 1Fh
add eax, edx
sar eax, 1
What is the purpose of this code?
I know that sar eax, 1 divides by 2, but what does shr eax, 1Fh do? Does this mean that EAX will be either 0 or 1, depending on whether the leftmost bit was 0 or 1?
This looks strange to me! Can someone explain it?
The quick answer to your question (what is shr eax, 1Fh?) is that it serves to isolate the uppermost bit of eax. It might be easier to understand if you convert the hexadecimal 1Fh to decimal 31. Now you can see that you're shifting eax right by 31. Since eax is a 32-bit value, shifting its bits right by 31 isolates the very top bit, so eax will contain either 0 or 1, depending on the original value of bit 31 (numbering bits from 0).
This is a common trick for isolating the sign bit. When a value is interpreted as a signed integer on a two's-complement machine, the uppermost bit is the sign bit. It is set (== 1) if the value is negative, or clear (== 0) otherwise. Of course, if the value is interpreted as an unsigned integer, the uppermost bit is just another bit used for storing its value, so the uppermost bit has an arbitrary value.
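As a rough illustration of the same trick in C, here is a minimal sketch. The helper name sign_bit is made up for this example, and it assumes a 32-bit two's-complement value, as on x86:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if x is negative, 0 otherwise.
   Casting to uint32_t first makes the shift a logical one (like SHR),
   so the 31 vacated upper bits are filled with zeros. */
uint32_t sign_bit(int32_t x)
{
    return (uint32_t)x >> 31;
}
```

For example, sign_bit(-5) yields 1, while sign_bit(7) and sign_bit(0) both yield 0.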
Going line by line through the disassembly, here's what the code does:
mov eax, edx
Evidently, the input was in EDX. This instruction copies the value from EDX into EAX. This allows subsequent code to manipulate the value in EAX without losing the original (in EDX).
shr eax, 1Fh
Shift EAX right by 31 places, thus isolating the uppermost bit. Assuming that the input value is a signed integer, this will be the sign bit. EAX will now contain 1 if the original value was negative, or 0 otherwise.
add eax, edx
Add the original value (EDX) to our temporary value in EAX. If the original value was negative, this will add 1 to it. Otherwise, it will add 0.
sar eax, 1
Shift EAX right by 1 place. Unlike the earlier SHR, which is a logical right shift, SAR is an arithmetic right shift. A logical shift fills the newly-exposed bits with 0s. An arithmetic shift copies the uppermost bit (the sign bit) into the newly-exposed bits, so the sign of the value is preserved.
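To make the distinction concrete, here is a small C sketch of the two kinds of shift. The names shr1 and sar1 are invented for this example. Because C leaves the result of right-shifting a negative signed value implementation-defined, sar1 is written so that it only ever shifts a non-negative value:

```c
#include <stdint.h>

/* Logical right shift by 1 (like SHR): the vacated top bit becomes 0. */
uint32_t shr1(uint32_t bits)
{
    return bits >> 1;
}

/* Arithmetic right shift by 1 (like SAR): the vacated top bit is a copy
   of the old sign bit. For negative x, ~x is non-negative, so the shift
   itself is well-defined; complementing again restores the sign. */
int32_t sar1(int32_t x)
{
    return (x < 0) ? ~(~x >> 1) : (x >> 1);
}
```

Taking the bit pattern 0xFFFFFFF8 (which is -8 as a signed value), shr1 produces 0x7FFFFFFC, a large positive number, whereas sar1(-8) produces -4.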
Putting it all together, this is a standard idiom for dividing a signed integer value by 2 in such a way that negative values are rounded toward zero rather than toward negative infinity.
When you divide an unsigned value by 2, a simple bit-shift is all that is required. Thus:
unsigned Foo(unsigned value)
{
    return (value / 2);
}
is equivalent to:
shr eax, 1
But when dividing a signed value, you must deal with the sign bit. You could use sar eax, 1 to implement a signed integer division by 2, but the result would then be rounded toward negative infinity. That differs from the behavior of the DIV/IDIV instructions, which always round toward zero, and that is also what C's / operator requires. If you want to emulate the round-toward-zero behavior, you need some special handling, which is precisely what the code you have does. In fact, GCC, Clang, MSVC, and probably every other compiler will generate essentially this code (register choice aside) when you compile the following function:
int Foo(int value)
{
    return (value / 2);
}
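If it helps, the four-instruction sequence can be mirrored step by step in C. This is only a sketch: floor_half and div2_idiom are names invented here, and the arithmetic shift is spelled out by hand because C does not guarantee what >> does to a negative value.

```c
#include <stdint.h>

/* What a bare `sar eax, 1` computes: x / 2 rounded toward negative
   infinity. Only non-negative values are ever shifted. */
int32_t floor_half(int32_t x)
{
    return (x < 0) ? ~(~x >> 1) : (x >> 1);
}

/* The full sequence from the disassembly, one step per line. */
int32_t div2_idiom(int32_t x)
{
    int32_t bias = (int32_t)((uint32_t)x >> 31); /* mov + shr: 1 if x < 0, else 0 */
    int32_t sum  = x + bias;                     /* add: cannot overflow, bias is 1 only when x < 0 */
    return floor_half(sum);                      /* sar: arithmetic shift right by 1 */
}
```

With x = -7, the shift alone (floor_half) gives -4, but div2_idiom first bumps the value to -6 and then shifts, giving -3, which matches -7 / 2 in C. For non-negative inputs the bias is 0 and the two agree.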
This is a very old trick. Michael Abrash discussed it in his Zen of Assembly Language, published circa 1990. (Here is the relevant section in an online copy of his book.) It was surely common knowledge among assembly-language gurus long before that.