I think I'm going insane with this.
I have a a piece of code that needs to create an (unsigned) integer with N
consequent bits set to 1. To be exact I have a bitmask, and in some situations I'd like to set it to a solid rnage.
I have the following function:
void MaskAddRange(UINT& mask, UINT first, UINT count)
{
mask |= ((1 << count) - 1) << first;
}
In simple words: 1 << count
in binary representation is 100...000
(number of zeroes is count
), subtracting 1 from such a number gives 011...111
, and then we just left-shift it by first
.
The above should yield correct result, when the following obvious limitation is met:
first + count <= sizeof(UINT)*8 = 32
Note that it should also work correctly for "extreme" cases.
- if
count = 0
we have (1 << count) = 1
, and hence ((1 << count) - 1) = 0
.
- if
count = 32
we have (1 << count) = 0
, since the leading bit overflows, and according to C/C++ rules bitwise shift operators are not cyclic. Then ((1 << count) - 1) = -1
(all bits set).
However, as turned out, for count = 32
the formula doesn't work as expected. As discovered:
UINT n = 32;
UINT x = 1 << n;
// the value of x is 1
Moreover, I'm using MSVC2005 IDE. When I evaluate the above expression in the debugger, the result is 0. However when I step over the above line, x
gets value of 1. Lokking via the disassembler we see the following:
mov eax,1
mov ecx,dword ptr [ebp-0Ch] // ecx = n
shl eax,cl // eax <<= LOBYTE(ecx)
mov dword ptr [ebp-18h],eax // n = ecx
There's no magic indeed, compiler just used shl
instruction. Then it seems that shl
doesn't do what I expected it should do. Either CPU decides to ignore this instruction, or the shift is treated modulo 32, or donno what.
My questions are:
- What is the correct behavior of
shl
/shr
instructions?
- Is there a CPU flag controlling the bitshift instructions?
- Is this according to C/C++ standard?
Thanks in advance
Edit:
Thanks for answers. I've realized that (1) shl
/shr
indeed treat operand modulo 32 (or & 0x1F) and (2) C/C++ standard treats shift by more than 31 bits as undefined behavior.
Then I have one more question. How can I rewrite my "masking" expression to cover this extreme case too. It should be without branching (if
, ?
). What'd be the simplest expression?
1U << 32
is undefined behavior in C and in C++ when type unsigned int
is 32-bit wide.
(C11, 6.5.7p3) "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined"
(C++11, 5.8p1) "The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand."
Shifting by as many or more bits than in the integer type you're shifting is undefined in C and C++. On x86 and x86_64, the shift amount of the shift instructions is indeed treated modulo 32 (or whatever the operand size is). You however cannot rely on this modulo behaviour to be generated by your compiler from C or C++ >>
/<<
operations unless your compiler explicitly guarantees it in its documentation.
I think the expression 1 << 32
is the same as 1 << 0
. IA-32 Instruction Set Reference says that the count operand of shift instructions is masked to 5 bits.
The instruction set reference of IA-32 architectures can be found here.
To fix the "extreme" case, I can only come up with the following code (maybe buggy) that may be a little awkward:
void MaskAddRange(UINT *mask, UINT first, UINT count) {
int count2 = ((count & 0x20) >> 5);
int count1 = count - count2;
*mask |= (((1 << count1) << count2) - 1) << first;
}
The basic idea is to split the shift operation so that each shift count does not exceed 31.
Apparently, the above code assumes that the count is in a range of 0..32, so it is not very robust.
If I have understood the requirements, you want an unsigned int, with the top N bits set?
There are several ways to get the result (I think) you want.
Edit:
I am worried that this isnt very robust, and will fail for n>32:
uint32_t set_top_n(uint32 n)
{
static uint32_t value[33] = { ~0xFFFFFFFF, ~0x7FFFFFFF, ~0x3FFFFFFF, ~0x1FFFFFFF,
~0x0FFFFFFF, ~0x07FFFFFF, ~0x03FFFFFF, ~0x01FFFFFF,
~0x00FFFFFF, ~0x007FFFFF, ~0x003FFFFF, ~0x001FFFFF,
// you get the idea
0xFFFFFFFF
};
return value[n & 0x3f];
}
This should be quite fast as it is only 132 bytes of data.
To make it robust, I'd either extend for all values up to 63, or make it conditional, in which case it can be done with a version of your original bit-masking + the 32 case. I.e.
My 32 cents:
#include <limits.h>
#define INT_BIT (CHAR_BIT * sizeof(int))
unsigned int set_bit_range(unsigned int n, int frm, int cnt)
{
return n | ((~0u >> (INT_BIT - cnt)) << frm);
}
List 1.
A safe version with bogus / semi-circular result could be:
unsigned int set_bit_range(unsigned int n, int f, int c)
{
return n | (~0u >> (c > INT_BIT ? 0 : INT_BIT - c)) << (f % INT_BIT);
}
List 2.
Doing this without branching, or local variables, could be something like;
return n | (~0u >> ((INT_BIT - c) % INT_BIT)) << (f % INT_BIT);
List 3.
List 2 and List 3 This would give "correct" result as long as from
is less then INT_BIT
and >= 0. I.e.:
./bs 1761 26 810
Setting bits from 26 count 810 in 1761 -- of 32 bits
Trying to set bits out of range, set bits from 26 to 836 in 32 sized range
x = ~0u = 1111 1111 1111 1111 1111 1111 1111 1111
Unsafe version:
x = x >> -778 = 0000 0000 0000 0000 0000 0011 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v1 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
Safe version, branching:
x = x >> 0 = 1111 1111 1111 1111 1111 1111 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v2 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
Safe version, modulo:
x = x >> 22 = 0000 0000 0000 0000 0000 0011 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v3 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
You could avoid the undefined behavior by splitting the shift operation in two steps, the first one by (count - 1) bits and the second one by 1 more bit. Special care is needed in case count is zero, however:
void MaskAddRange(UINT& mask, UINT first, UINT count)
{
if (count == 0) return;
mask |= ((1 << (count - 1) << 1) - 1) << first;
}