I understand that >>> fixes the overflow: when adding two big positive longs you may end up with a negative number. Can someone explain how this bitwise shift magically fixes the overflow problem? And how is it different from >>?
My suspicion: I think it has to do with the fact that Java uses two's complement, so the overflowed sum would be the right number if we had the extra bit, but because we don't, it becomes negative. So when you shift and pad with zeros, it magically gets fixed thanks to two's complement. But I could be wrong, and someone with a bitwise brain has to confirm. :)
In short, (high + low) >>> 1
is a trick that uses the unused sign-bit to perform a correct average of non-negative numbers.
Under the assumption that high
and low
are both non-negative, we know for sure that the upper-most bit (the sign-bit) is zero.
So both high
and low
are in fact 31-bit integers.
high = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824
low = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824
When you add them together they may "spill" over into the top-bit.
high + low = 1000 0000 0000 0000 0000 0000 0000 0000
= 2147483648 as unsigned 32-bit integer
= -2147483648 as signed 32-bit integer
(high + low) / 2 = 1100 0000 0000 0000 0000 0000 0000 0000 = -1073741824
(high + low) >>> 1 = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824
As a signed 32-bit integer, the sum overflows and flips negative. Therefore (high + low) / 2
is wrong because high + low
could be negative.
As unsigned 32-bit integers, the sum is correct. All that's needed is to divide it by 2.
Of course Java has no unsigned integer types, so the closest thing we have to an unsigned divide by 2 is the logical right-shift >>>
.
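To make the bit patterns above concrete, here is a minimal Java sketch that reproduces them. The class and method names (MidpointOverflowDemo, midByDivision, midByLogicalShift) are made up for illustration, and it assumes both inputs are non-negative ints, as in the explanation above.

```java
class MidpointOverflowDemo {
    // Signed division: gives a wrong (negative) result once low + high
    // overflows into the sign bit.
    static int midByDivision(int low, int high) {
        return (low + high) / 2;
    }

    // Logical right-shift: reads the 32-bit sum as if it were unsigned,
    // then halves it. Assumes low and high are both non-negative.
    static int midByLogicalShift(int low, int high) {
        return (low + high) >>> 1;
    }

    public static void main(String[] args) {
        int low = 0x40000000;   // 1073741824
        int high = 0x40000000;  // 1073741824

        System.out.println(low + high);                    // -2147483648
        System.out.println(midByDivision(low, high));      // -1073741824
        System.out.println(midByLogicalShift(low, high));  // 1073741824
    }
}
```

The shift version only works because the true sum of two non-negative ints always fits in 32 unsigned bits.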
In languages with unsigned integers (such as C and C++), it gets trickier since your input can be full 32-bit integers. One solution is: low + ((high - low) / 2)
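A rough sketch of that subtraction form, written in Java to stay consistent with the rest of the thread (the expression itself is the same in C and C++). The class name MidpointBySubtraction and the method mid are hypothetical, and the sketch assumes 0 <= low <= high so that high - low cannot overflow.

```java
class MidpointBySubtraction {
    // Never forms low + high, so there is no sum to overflow.
    // Assumes 0 <= low <= high.
    static int mid(int low, int high) {
        return low + ((high - low) / 2);
    }

    public static void main(String[] args) {
        System.out.println(mid(0x40000000, 0x40000000));  // 1073741824
        System.out.println(mid(1, Integer.MAX_VALUE));    // 1073741824
    }
}
```

The trade-off is one extra subtraction in exchange for not relying on a spare top bit.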
Finally to enumerate the differences between >>>
, >>
, and /
:
>>>
is logical right-shift. It fills the upper bits with zero.
>>
is arithmetic right-shift. It fills the upper bits with copies of the original top bit.
/
is division.
Mathematically:
x >>> 1
treats x
as an unsigned integer and divides it by two. It rounds down.
x >> 1
treats x
as a signed integer and divides it by two. It rounds towards negative infinity.
x / 2
treats x
as a signed integer and divides it by two. It rounds towards zero.
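The three rounding behaviours only differ for negative inputs, so a small sketch with x = -3 shows them side by side. The class and method names here are invented for the example; Integer.divideUnsigned and Math.floorDiv are standard library methods (Java 8 and later) used only as cross-checks.

```java
class ShiftVersusDivide {
    static int logicalHalf(int x)    { return x >>> 1; }  // zero-fills the top bit
    static int arithmeticHalf(int x) { return x >> 1; }   // copies the sign bit
    static int dividedHalf(int x)    { return x / 2; }    // truncates toward zero

    public static void main(String[] args) {
        int x = -3;  // bit pattern 0xFFFFFFFD

        System.out.println(logicalHalf(x));     // 2147483646
        System.out.println(arithmeticHalf(x));  // -2
        System.out.println(dividedHalf(x));     // -1

        // Cross-checks against the library equivalents.
        System.out.println(logicalHalf(x) == Integer.divideUnsigned(x, 2));  // true
        System.out.println(arithmeticHalf(x) == Math.floorDiv(x, 2));        // true
    }
}
```

For non-negative x all three agree, which is why the difference is easy to miss.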
>>> zero-fills the topmost bits instead of sign-filling them.
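int a = 0x40000000;
System.out.println((a + a) / 2 == 0xC0000000);    // true: the overflowed sum stays negative
System.out.println((a + a) >>> 1 == 0x40000000);  // true: the zero-filled shift recovers a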
I'd suggest reading Josh Bloch's post at http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html#!/2006/06/extra-extra-read-all-about-it-nearly.html about high and low:
"The version of binary search that I wrote for the JDK contained the same bug. It was reported to Sun recently when it broke someone's program, after lying in wait for nine years or so."
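For context, here is roughly where the midpoint expression sits in a binary search. This is an illustrative sketch, not the JDK's actual code; the class name BinarySearchSketch and the method indexOf are hypothetical, and it returns -1 for a missing key rather than following any library convention.

```java
class BinarySearchSketch {
    // Returns an index of key in the ascending-sorted array, or -1 if absent.
    static int indexOf(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;

        while (low <= high) {
            // low and high are non-negative here, so the logical shift
            // gives the correct midpoint even if low + high overflows.
            int mid = (low + high) >>> 1;

            if (sorted[mid] < key) {
                low = mid + 1;
            } else if (sorted[mid] > key) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 5, 7, 9};
        System.out.println(indexOf(data, 7));  // 3
        System.out.println(indexOf(data, 4));  // -1
    }
}
```

The overflow only shows up once low + high exceeds Integer.MAX_VALUE, which needs an array of more than about a billion elements, so small tests like the one above pass with either midpoint formula.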