I have a hashmap with byte[] keys. I'd like to sort it through a TreeMap.
What is the most effective way to implement the comparator for lexicographic order?
I'm assuming the problem is just with the "byte vs. byte" comparison. Dealing with the arrays is straightforward, so I won't cover it. With respect to byte vs. byte, my first thought is to do this:
But that won't be lexicographic: 0xFF (the signed byte for -1) will be considered smaller than 0x00, when lexicographically it's bigger. I think this should do the trick:
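The snippets from this answer did not survive here, so the following is only a sketch of the idea, not the original code. It assumes the first attempt was a plain signed comparison and that the fix masks each byte with 0xFF before comparing; the class and method names are made up for illustration.

```java
// Hypothetical helper; the names are illustrative, not from the original answer.
final class ByteCompareSketch {
    // Naive version: compares the bytes as signed values,
    // so 0xFF (which is -1 as a byte) sorts before 0x00.
    static int compareSigned(byte a, byte b) {
        return a - b;
    }

    // Masking with 0xFF widens each byte to an int in 0..255,
    // which gives the unsigned (lexicographic) order.
    static int compareUnsigned(byte a, byte b) {
        return (a & 0xFF) - (b & 0xFF);
    }
}
```

With these, compareSigned((byte) 0xFF, (byte) 0x00) is negative while compareUnsigned((byte) 0xFF, (byte) 0x00) is positive. On Java 8 and later, Byte.toUnsignedInt(a) does the same widening as a & 0xFF.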
Probably there is something in Apache's commons-lang or commons-math libraries that does this, but I don't know it off hand.
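For what it's worth, on Java 9 or later the JDK itself covers this: Arrays.compareUnsigned(byte[], byte[]) compares two arrays lexicographically, treating the elements as unsigned. A minimal sketch of plugging it into a TreeMap follows; the class name, method name, and String value type are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Illustrative class name; requires Java 9+ for Arrays.compareUnsigned.
final class UnsignedKeySort {
    // Copies the entries of an unsorted map into a TreeMap ordered by
    // unsigned lexicographic comparison of the byte[] keys.
    static TreeMap<byte[], String> sortByKey(Map<byte[], String> unsorted) {
        Comparator<byte[]> unsignedLex = Arrays::compareUnsigned;
        TreeMap<byte[], String> sorted = new TreeMap<>(unsignedLex);
        sorted.putAll(unsorted);
        return sorted;
    }
}
```

Note that the TreeMap then treats two keys with equal contents as the same key, unlike a HashMap keyed on byte[], which goes by array identity.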
Using Guava, you can use either of:
The UnsignedBytes comparator appears to have an optimized form using Unsafe that it uses if it can. Comments in the code indicate that it may be at least twice as fast as a normal Java implementation.
You can use a comparator which compares the Character.toLowerCase() of each of the bytes in the array (assuming the byte[] is in ASCII). If it is not, you will need to do the character decoding yourself or use
new String(bytes, charSet).toLowerCase(), but this is not likely to be efficient.
Found this nice piece of code in Apache HBase:
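The HBase snippet itself is not reproduced here. As a rough sketch of what such a routine typically looks like (this is not the HBase source, and the class and member names are hypothetical), the following compares two byte ranges as unsigned values and falls back to the lengths when one range is a prefix of the other:

```java
import java.util.Comparator;

// Hypothetical class; not copied from HBase.
final class ByteRangeCompare {
    // Lexicographically compares a[aOff, aOff + aLen) with b[bOff, bOff + bLen),
    // treating each byte as an unsigned value in 0..255.
    static int compareTo(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        if (a == b && aOff == bOff && aLen == bLen) {
            return 0; // same range of the same array
        }
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xFF;
            int y = b[bOff + i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // One range is a prefix of the other: the shorter one sorts first.
        return aLen - bLen;
    }

    // Whole-array comparator, usable as new TreeMap<byte[], V>(ByteRangeCompare.WHOLE_ARRAY).
    static final Comparator<byte[]> WHOLE_ARRAY =
            (a, b) -> compareTo(a, 0, a.length, b, 0, b.length);
}
```

Taking offsets and lengths lets callers compare slices of larger buffers without copying them first.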