I'm converting BigInteger values to binary, radix-16 and radix-64 encodings and seeing a mysterious zero byte of padding at the most significant end. Is this a BigInteger problem that I can work around by stripping the zero padding, or should I be doing something else?
My test code:
String s;
System.out.printf( "%s length %d\n", s = "123456789A", (new BigInteger( s, 16 )).toByteArray().length );
System.out.printf( "%s length %d\n", s = "F23456789A", (new BigInteger( s, 16 )).toByteArray().length );
Produces output:
123456789A length 5
F23456789A length 6
The longer array has a zero byte of padding at the front. Upon inspecting BigInteger.toByteArray() I see:
public byte[] toByteArray() {
int byteLen = bitLength()/8 + 1;
byte[] byteArray = new byte[byteLen];
Now, I can find the field private int bitLength; but I can't quite find where bitLength() is defined to figure out exactly why this class does this. Is it connected to sign extension, perhaps?
Thanks Jon Skeet for your answer. Here's some code I'm using to convert; it can very likely be optimized.
import java.math.BigInteger;
import java.util.Arrays;

public class UnsignedBigInteger {
    public static byte[] toUnsignedByteArray(BigInteger value) {
        // Reject negatives by sign; the first byte of a positive value is often non-zero.
        if (value.signum() < 0) {
            throw new IllegalArgumentException("value must be a non-negative BigInteger");
        }
        byte[] signedValue = value.toByteArray();
        // The extra 0x00 sign byte is only present when the top magnitude bit is set.
        if (signedValue.length > 1 && signedValue[0] == 0x00) {
            return Arrays.copyOfRange(signedValue, 1, signedValue.length);
        }
        return signedValue;
    }

    public static BigInteger fromUnsignedByteArray(byte[] value) {
        // signum = 1 makes BigInteger treat the bytes as an unsigned big-endian magnitude.
        return new BigInteger(1, value);
    }
}
Yes, this is the documented behaviour:
The byte array will be in big-endian byte-order: the most significant byte is in the zeroth element. The array will contain the minimum number of bytes required to represent this BigInteger, including at least one sign bit, which is (ceil((this.bitLength() + 1)/8)).
bitLength() is documented as:
Returns the number of bits in the minimal two's-complement representation of this BigInteger, excluding a sign bit.
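To make the two quoted formulas concrete, here is a small sketch that prints bitLength() next to the byte count the documentation predicts and the length toByteArray() actually returns. The class name BitLengthDemo and the helper expectedByteLength are made up for this illustration; only bitLength() and toByteArray() are real BigInteger methods.

```java
import java.math.BigInteger;

class BitLengthDemo {
    // Hypothetical helper: ceil((bitLength + 1) / 8), written with integer
    // arithmetic the same way the toByteArray() excerpt in the question does.
    static int expectedByteLength(BigInteger v) {
        return v.bitLength() / 8 + 1;
    }

    public static void main(String[] args) {
        String[] samples = {"123456789A", "F23456789A", "7F", "80", "-80"};
        for (String hex : samples) {
            BigInteger v = new BigInteger(hex, 16);
            System.out.printf("%s bitLength=%d expected=%d actual=%d%n",
                    hex, v.bitLength(), expectedByteLength(v), v.toByteArray().length);
        }
    }
}
```

For the two values from the question this gives a bitLength() of 37 and 40, so 5 and 6 bytes respectively: the 40-bit value fills five bytes completely and the sign bit needs a sixth.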
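So in other words, two values with the same magnitude will have the same bit length regardless of sign (the one exception being negative powers of two, which need one bit fewer). For non-negative values, think of a BigInteger as being an unsigned integer and a sign bit, and toByteArray() returns all the data from both parts, which is "the number of bits required for the unsigned integer, and one bit for the sign". F23456789A needs exactly 40 bits, so the sign bit spills over into a sixth byte, which is the leading zero you're seeing; 123456789A needs only 37 bits, so the sign bit fits in the top byte.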
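If you only ever deal with non-negative values, one way to get the unsigned bytes is to drop that sign byte when it is present, and to go back through the BigInteger(int signum, byte[] magnitude) constructor. This is a minimal sketch rather than the only way to do it; the class name UnsignedBytes and both method names are invented for the example.

```java
import java.math.BigInteger;
import java.util.Arrays;

class UnsignedBytes {
    // Hypothetical helper: big-endian magnitude of a non-negative value,
    // without the extra leading 0x00 that only carries the sign bit.
    static byte[] toUnsigned(BigInteger value) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        byte[] signed = value.toByteArray();
        // For a non-zero value, a leading 0x00 appears only when the top
        // magnitude bit is set, so at most one byte has to be removed.
        if (signed.length > 1 && signed[0] == 0) {
            return Arrays.copyOfRange(signed, 1, signed.length);
        }
        return signed;
    }

    // Hypothetical helper: signum = 1 tells BigInteger to read the bytes
    // as an unsigned magnitude, so no padding byte has to be added back.
    static BigInteger fromUnsigned(byte[] magnitude) {
        return new BigInteger(1, magnitude);
    }

    public static void main(String[] args) {
        BigInteger v = new BigInteger("F23456789A", 16);
        byte[] raw = toUnsigned(v);
        System.out.printf("%d bytes, round trip ok: %b%n", raw.length, fromUnsigned(raw).equals(v));
    }
}
```

Note that the result is variable-length: if your encoding needs a fixed width, you would still have to left-pad with zeros yourself.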