I don't understand the output from the following code:
public static void main(String[] args) {
    int i1, i2, i3, i4;
    byte b;
    i1 = 128;
    b = (byte) i1;
    i2 = (int) b;
    i3 = 0 | b;
    i4 = 1 << 7;
    System.out.format("i1: %d b: %d i2: %d i3: %d i4: %d\n", i1, b, i2, i3, i4);
}
Output:
i1: 128 b: -128 i2: -128 i3: -128 i4: 128
Because byte is an 8-bit two's-complement signed integer, the binary representations with a 1 in the most significant bit are interpreted as negative values, which is why b becomes -128, which I'm totally fine with. I also understand that it's probably a good idea to keep the interpretation consistent when casting, as with i2. But shouldn't i3 and i4 have identical bit patterns and therefore map to identical int values?
Sign extension is what is making i2 and i3 negative.
In the expression (0 | b), b is promoted to an int, and sign extension occurs as part of this promotion.
That's not happening in the expression assigned to i4. The constants 1 and 7 are already ints, so there's no sign extension involved.
In this line:
i3 = 0 | b;
the variable b is automatically promoted to int with sign extension because of the | operator, so it becomes (int) -128, i.e. 0xffffff80.
When ORed with zero, it's still the same value, namely -128.
No, i4 is not a byte value, it's an int. That means that its sign bit is bit 31, not bit 7.
UPDATE: i3 is an int too, but it is initialized by extending a byte, so it keeps the sign from the byte value.
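A quick way to check the point about the sign bit is to test bit 31 directly. This is only a sketch; SignBitDemo and intSignBitSet are hypothetical names chosen for the example.

```java
public class SignBitDemo {
    // True when bit 31, the sign bit of an int, is set.
    static boolean intSignBitSet(int value) {
        return (value & 0x80000000) != 0;
    }

    public static void main(String[] args) {
        int i4 = 1 << 7;          // only bit 7 set
        int top = 1 << 31;        // only bit 31 set
        byte b = (byte) (1 << 7); // bit 7 is the sign bit of a byte
        int i3 = b;               // widening copies bit 7 into bits 8..31

        System.out.println(i4 + " " + intSignBitSet(i4));   // 128 false
        System.out.println(top + " " + intSignBitSet(top)); // -2147483648 true
        System.out.println(i3 + " " + intSignBitSet(i3));   // -128 true
    }
}
```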
i2 = (int) b;
i3 = 0 | b;
The i3 statement is equivalent to:
i3 = 0 | ((int) b);   // i.e. 0 | i2
so naturally it is going to have the same value as i2.
That's simple. In i3 = 0 | b; the byte b is first promoted to int, and only then is the OR evaluated. Whereas i4 = 1 << 7; is evaluated entirely in int and assigned to an int. So in the first case we get 10000000b converted from byte to int, which sign-extends to -128. And in the second the same low-order bit pattern is produced directly as an int, with no conversion from byte, which gives us 128.
If you want to get the unsigned value of the bit pattern in a byte:
b & 0xff
See also this old answer.
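As a rough illustration of the masking trick, the sketch below compares the signed and unsigned readings of the same byte. UnsignedByteDemo and unsigned are names invented for this example; Byte.toUnsignedInt is part of the standard library from Java 8 onward and does the same thing.

```java
public class UnsignedByteDemo {
    // b is promoted to int with sign extension; the mask then clears
    // the upper 24 bits, leaving only the original 8-bit pattern.
    static int unsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 128;
        System.out.println(b);                     // -128
        System.out.println(unsigned(b));           // 128
        System.out.println(Byte.toUnsignedInt(b)); // 128
        System.out.println(0 | unsigned(b));       // 128, now equal to 1 << 7
    }
}
```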
In
i3 = 0 | b;
I'm guessing the 0 | b part is evaluated as a byte, and the result is then cast to an int, while in
i4 = 1 << 7;
the 1 << 7 part is already an int.
The above guess has been pointed out in the comments to be wrong!
The correct version is: in the top expression, b is converted to an int with sign extension before the OR operation is performed.