How to find number of 1's in a binary number i

I know this has been asked before, but I'm looking at this particular solution listed here:

int BitCount(unsigned int u)
{
     unsigned int uCount;

     uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
     return ((uCount + (uCount >> 3)) & 030707070707) % 63;
}

How does it work?

Are there any caveats involved here?

Theoretically is it possible to find the answer in constant time? I mean don't we actually have to iterate through the bits to count?

标签： c algorithm bit-manipulation

4条回答

可以哭但决不认输i

2楼-- · 2020-06-23 05:52

Iterating the bits is constant time since the number of bits in a type is constant.

So a solution that checks a mask of one bit and shifts for each bit in the target value is indeed O(1) (e.g. where the constant is 32).

0人赞添加讨论(0) 举报

We Are One

3楼-- · 2020-06-23 06:05

Counting bits set, in parallel shows how this is done. That method can be used for 8-, 16-, 32-, 64-, 128-, etc. bit words, although the constants used in the calculations change.

When we say that this operation is O(1), we mean that it can be done in constant time regardless of the word size. Counting the bits the naive way is O(n) in the number of bits.

Speaking practically, this is only O(1) when the processor can work with the word size natively.

As for how it works, it uses "magic numbers." See this newsgroup post for an explanation.

0人赞添加讨论(0) 举报

混吃等死

4楼-- · 2020-06-23 06:15

Counting bits

An unsigned 32-bit integer u can be written like this:

u = a₃₁ * 2³¹ + a₃₀ * 2³⁰ + ... + a₀ * 2⁰

We want the value of a₃₁ + a₃₀ + ... + a₀.

Let's compare the values of u >> k:

u >> 0  = a₃₁ * 2³¹ + a₃₀ * 2³⁰ + ... + a₁ * 2¹ + a₀ * 2⁰
u >> 1  = a₃₁ * 2³⁰ + a₃₀ * 2²⁹ + ... + a₁ * 2⁰
u >> 2  = a₃₁ * 2²⁹ + a₃₀ * 2²⁸ + ...
...
u >> 29 = a₃₁ * 2² + a₂₉ * 2¹ + ...
u >> 30 = a₃₁ * 2¹ + a₃₀ * 2⁰
u >> 31 = a₃₁ * 2⁰

We will calculate the bit population by this formula:

u >> 0 - u >> 1 - u >> 2 - ... - u >> 31 = p

Let's see why this works:

  u >> 0 - u >> 1 - u >> 2 - ... - u >> 31
= u >> 0 - (u >> 1 + u >> 2 + ... + u >> 31)
= u - q

What's the value of q? Let's calculate it bit by bit, looking at the values for u >> k above. For a₃₁, it's:

  a₃₁ * 2³⁰ + a₃₁ * 2²⁹ + ...
= a₃₁ * (2³⁰ + 2²⁹ + ...)
= a₃₁ * (2³¹ - 1)

Or for a₃₀:

  a₃₀ * 2²⁹ + a₃₀ * 2²⁸ + ...
= a₃₀ * (2²⁹ + 2²⁸ + ...)
= a₃₀ * (2³⁰ - 1)

We find: q = a₃₁ * (2³¹ - 1) + a₃₀ * (2³⁰ - 1) + ...

And thus

u - q = a₃₁ * 2³¹ - a₃₁ * (2³¹ - 1) + ...
      = a₃₁ + a₃₀ + ... + a₀

Counting bits in 3-bit blocks

This algorithm starts by doing the same thing, but in blocks of 3 bits:

u >> 0                = AaBbbCccDddEeeFffGggHhhIiiJjjKkk (each letter is a bit)
u >> 1 & 033333333333 =  A Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk (blank = zero)
u >> 2 & 011111111111 =     B  C  D  E  F  G  H  I  J  K

Taking the difference of these, by the above algorithm, each octet in uCount contains the number of bits set in the corresponding octet in u.

uCount      =   αβγδεζηθικλ (each greek letter is an octet)
uCount >> 3 =    αβγδεζηθικ

So uCount + (uCount >> 3) is (λ+κ) * 2⁰ + (κ+ι) * 2³ + (ι+θ) * 2⁶ + ...

By ANDing with 0o30707070707, we mask away every other octet, so that we only count each pair once:

r = (λ+κ) *  2⁰ + (ι+θ) *  2⁶ + (η+ζ) *  2¹² + ...
  = (λ+κ) * 64⁰ + (ι+θ) * 64¹ + (η+ζ) * 64² + ...

This is a base-64 number, and we want to sum up the base-64 digits to get α+β+γ+δ+ε+ζ+η+θ+ι+κ+λ, our final result. To do this, we calculate its base-64 digital root: knowing that the result can never be bigger than 32, we just modulo the number by 63.

0人赞添加讨论(0) 举报

冷血范

5楼-- · 2020-06-23 06:15

The fastest way to do this is popcnt instruction. You can access it through compiler intrinsic usually. Your solution can be useful on platforms that lack this instruction.

0人赞添加讨论(0) 举报