How does this bit manipulation work in Java?

2020-07-09 02:58发布

I was looking into how Java counts bits set of an int.
I had in my mind something simple like this (which I think is correct):

public static int bitCount(int number){  
        final int MASK = 0x1;  
        int count = 0;  

        for(int i = 0; i < 32; i++){  
            if(((number >>> i) & MASK) == MASK){  
                count++;  
            }  
        }  
        return count;  
    }  

Instead I found a method that I absolutely have no idea what is doing (seems like magic to me):

 i = i - ((i >>> 1) & 0x55555555);  
 i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);  
 i = (i + (i >>> 4)) & 0x0f0f0f0f;  
 i = i + (i >>> 8);  
 i = i + (i >>> 16);  
 return i & 0x3f;  

Could someone help understand what this does and why a simple function like the one I came up as first thought is/could be bad?

2条回答
时光不老,我们不散
2楼-- · 2020-07-09 03:37

Sure

i = i - ((i >>> 1) & 0x55555555);

The bit pattern of 5 is 0101 (four bits), so the mask is a repetition of the bit pattern 01 sixteen times. This line counts the number of bits in every two-bit pair of the number. If you consider two-bit pairs, (i >>> 1) & 0x1 gets the high-order bit in the low-order position. Now, there are four possible two-bit patterns

00 ~> 00 - 00 = 00
01 ~> 01 - 00 = 01
10 ~> 10 - 01 = 01
11 ~> 11 - 01 = 10

and each two-bit pair now contains the number of bits the original had in those positions.

i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);

Next, we count the bits that were in each four-bit group (aka nibble). By masking a nibble with 0x3 = 0011(b), we get the count of bits that were in the lower order two bits of the nibble, and (i >>> 2) & 0x3 gets the count from the higher order two bits of the nibble. These counts are now added. Since each count was at most 2, the sum is at most 4, hence doesn't leave the nibble.

i = (i + (i >>> 4)) & 0x0f0f0f0f;

Now the bits in each octet are counted. Each nibble contains the count of bits set in the original in that place. Shifting right by four bits moves the count from the higher order nibble in each place to the lower order nibble, these are added. Then we also moved the count from the lower order nibble to the higher order nible of the adjacent octet, that is masked out by the & 0x0f0f0f0f. Since each octet can have at most eight bits set, the count doesn't leave the lower order nibble of the octet.

i = i + (i >>> 8);

Now we add the counts of the pairs of adjacent octets.

i = i + (i >>> 16);

Now we add the totals of the counts in the high order two octets and the low order two.

return i & 0x3f;

Finally, all octets except the lowest order octet are masked out, since the higher order octets still contained intermediate counts.

The reason why your simple implementation might be considered bad is performance. The convoluted bit-hack uses fewer operations and no branches, so it is faster. It is, however, far easier to get wrong.

Another neat way to count the set bits in an int (or long) is

public static int bitCount(int n) {
    int count = 0;
    while(n != 0) {
        ++count;
        n = n & (n-1);
    }
    return count;
}

That works because n = n & (n-1) clears the last set bit in n and leaves everything else untouched. If the bit pattern of n ends with

...100...0

the bit pattern of n-1 is

...011...1

and the bits before the last 1-bit in n are the same. In Java, that is guaranteed to work also for negative numbers because integer overflow has wrap-around semantics, so if n = Integer.MIN_VALUE, the bit pattern is 100...0 and n - 1 becomes Integer.MAX_VALUE with the bit pattern 011...1.

查看更多
闹够了就滚
3楼-- · 2020-07-09 03:48

the cool method only works since the computer has a finite range of values for int. it won't work (and so other cool bits algorithms) so easily for infinite range (like BigInteger) since you need to know all of the needed masks before the calculation .

in any case , you can read how it works via : http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

it's at the bottom of this chapter.

查看更多
登录 后发表回答