Bit Twiddling in C - Counting Bits

2019-02-20 17:33发布

问题:

I want to count the bits that are set in an extremely large bit-vector (i.e. 100,000 bits).

What I am currently doing is using a pointer to char (i.e. char *cPtr) to point to the beginning of the array of bits. I then:

1. look at each element of the array (i.e. cPtr[x]),   
2. convert it to an integer (i.e. (int) cPtr[x])   
3. use a 256 element look-up table to see how many bits are set in the given byte (i.e. cPtr[x]). 

It occurs to me that if I use a short int pointer instead (i.e. short int * sPtr), then I will only need half as many look-ups, but with a 65534 element look-up table, which will have its own cost in memory usage.

I'm wondering what is the optimal number of bits to examine each time. Also, if that number is not the size of some preset type, how can I walk down my bit-vector and set a pointer to be ANY arbitrary number of bits past the starting location of the bit array.

I know there are other ways to count bits, but for now I want to be certain I can optimize this method before making comparisons to other methods.

回答1:

I'm wondering what is the optimal number of bits to examine each time

The only way to find out is to test. See this question for a discussion of the fastest way to count 32 bits at a time.

Also, if that number is not the size of some preset type, how can I walk down my bit-vector and set a pointer to be ANY arbitrary number of bits past the starting location of the bit array.

You can't set a pointer to an arbitrary bit. Most machines have byte-addressing, some can only address words.

You can construct a word starting with an arbitrary bit like so:

long wordAtBit(int32_t* array, size_t bit)
{
    size_t idx = bit>>5;
    long word = array[idx] >> (bit&31);
    return word | (array[idx+1] << (32 - (bit&31));
}


回答2:

You can count it using bitwise operation:

char c = cPtr[x];
int num = ((c & 0x01) >> 0) +
          ((c & 0x02) >> 1) +
          ((c & 0x04) >> 2) +
          ((c & 0x08) >> 3) +
          ((c & 0x10) >> 4) +
          ((c & 0x20) >> 5) +
          ((c & 0x40) >> 6) +
          ((c & 0x80) >> 7);

It might seem a little long, but it doesn't require accessing many time to memory, so after all it seems pretty cheap for me.

You can even make it cheaper by reading an int every time, but then you will probably need to address an alignment issue.



回答3:

This should be quite fast (taken from Wikipedia):

static unsigned char wordbits[65536] = { bitcounts of ints between 0 and 65535 };
static int popcount(uint32 i)
{
    return (wordbits[i&0xFFFF] + wordbits[i>>16]);
}

In this way, you can check 32 bits per iteration.



回答4:

I am a bit late to the party, but there are much faster approaches than the ones that have been suggested so far. The reason is that many modern architectures offer hardware instructions to count the number of bits in various ways (leading zeroes, leading ones, trailing zeroes or ones, counting the number of bits set to 1, etc...). Counting the number of bits set to 1 is called the Hamming weight, also commonly called population count, or just popcount.

As a matter of fact, x86 CPUs have a POPCNT instruction as part of the SSE4.2 instruction set. The very latest latest CPU architecture from Intel (nicknamed Haswell) offer even more hardware support for bit manipulation with the BMI1 and BMI2 extensions - maybe there is something else to use there!