I think I'm going insane with this.
I have a a piece of code that needs to create an (unsigned) integer with N
consequent bits set to 1. To be exact I have a bitmask, and in some situations I'd like to set it to a solid rnage.
I have the following function:
void MaskAddRange(UINT& mask, UINT first, UINT count)
{
mask |= ((1 << count) - 1) << first;
}
In simple words: 1 << count
in binary representation is 100...000
(number of zeroes is count
), subtracting 1 from such a number gives 011...111
, and then we just left-shift it by first
.
The above should yield correct result, when the following obvious limitation is met:
first + count <= sizeof(UINT)*8 = 32
Note that it should also work correctly for "extreme" cases.
- if
count = 0
we have(1 << count) = 1
, and hence((1 << count) - 1) = 0
. - if
count = 32
we have(1 << count) = 0
, since the leading bit overflows, and according to C/C++ rules bitwise shift operators are not cyclic. Then((1 << count) - 1) = -1
(all bits set).
However, as turned out, for count = 32
the formula doesn't work as expected. As discovered:
UINT n = 32;
UINT x = 1 << n;
// the value of x is 1
Moreover, I'm using MSVC2005 IDE. When I evaluate the above expression in the debugger, the result is 0. However when I step over the above line, x
gets value of 1. Lokking via the disassembler we see the following:
mov eax,1
mov ecx,dword ptr [ebp-0Ch] // ecx = n
shl eax,cl // eax <<= LOBYTE(ecx)
mov dword ptr [ebp-18h],eax // n = ecx
There's no magic indeed, compiler just used shl
instruction. Then it seems that shl
doesn't do what I expected it should do. Either CPU decides to ignore this instruction, or the shift is treated modulo 32, or donno what.
My questions are:
- What is the correct behavior of
shl
/shr
instructions? - Is there a CPU flag controlling the bitshift instructions?
- Is this according to C/C++ standard?
Thanks in advance
Edit:
Thanks for answers. I've realized that (1) shl
/shr
indeed treat operand modulo 32 (or & 0x1F) and (2) C/C++ standard treats shift by more than 31 bits as undefined behavior.
Then I have one more question. How can I rewrite my "masking" expression to cover this extreme case too. It should be without branching (if
, ?
). What'd be the simplest expression?
You could avoid the undefined behavior by splitting the shift operation in two steps, the first one by (count - 1) bits and the second one by 1 more bit. Special care is needed in case count is zero, however:
If I have understood the requirements, you want an unsigned int, with the top N bits set?
There are several ways to get the result (I think) you want. Edit: I am worried that this isnt very robust, and will fail for n>32:
This should be quite fast as it is only 132 bytes of data.
To make it robust, I'd either extend for all values up to 63, or make it conditional, in which case it can be done with a version of your original bit-masking + the 32 case. I.e.
Shifting by as many or more bits than in the integer type you're shifting is undefined in C and C++. On x86 and x86_64, the shift amount of the shift instructions is indeed treated modulo 32 (or whatever the operand size is). You however cannot rely on this modulo behaviour to be generated by your compiler from C or C++
>>
/<<
operations unless your compiler explicitly guarantees it in its documentation.I think the expression
1 << 32
is the same as1 << 0
. IA-32 Instruction Set Reference says that the count operand of shift instructions is masked to 5 bits.The instruction set reference of IA-32 architectures can be found here.
To fix the "extreme" case, I can only come up with the following code (maybe buggy) that may be a little awkward:
The basic idea is to split the shift operation so that each shift count does not exceed 31. Apparently, the above code assumes that the count is in a range of 0..32, so it is not very robust.
My 32 cents:
List 1.
A safe version with bogus / semi-circular result could be:
List 2.
Doing this without branching, or local variables, could be something like;
List 3.
List 2 and List 3 This would give "correct" result as long as
from
is less thenINT_BIT
and >= 0. I.e.:1U << 32
is undefined behavior in C and in C++ when typeunsigned int
is 32-bit wide.