Java Comparator for byte array (lexicographic)

I have a hashmap with byte[] keys. I'd like to sort it through a TreeMap.

What is the most effective way to implement the comparator for lexicographic order?

Tags: java sorting collections map compare

4 answers

#2 · 2019-01-23 16:25

I'm assuming the problem is just with the "byte vs. byte" comparison. Dealing with the arrays is straightforward, so I won't cover it. With respect to byte vs. byte, my first thought is to do this:

import java.util.Comparator;

public class ByteComparator implements Comparator<Byte> {
  public int compare(Byte b1, Byte b2) {
    // Byte.compareTo compares the signed values (-128 to 127).
    return b1.compareTo(b2);
  }
}

But that won't be lexicographic: 0xFF (the signed byte for -1) will be considered smaller than 0x00, when lexicographically it's bigger. I think this should do the trick:

import java.util.Comparator;

public class ByteComparator implements Comparator<Byte> {
  public int compare(Byte b1, Byte b2) {
    // convert to unsigned bytes (0 to 255) before comparing them.
    int i1 = b1 < 0 ? 256 + b1 : b1;
    int i2 = b2 < 0 ? 256 + b2 : b2;
    // negative when b1 sorts first, positive when b2 sorts first.
    return i1 - i2;
  }
}

There is probably something in Apache's commons-lang or commons-math libraries that does this, but I don't know of it offhand.
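To make the signed versus unsigned difference concrete, here is a minimal sketch that compares 0xFF and 0x00 both ways. The class name UnsignedByteOrder and its compareUnsigned helper are made up for illustration; Byte.compare and Byte.toUnsignedInt are standard methods, the latter assuming Java 8 or newer.

```java
final class UnsignedByteOrder {
    // Hypothetical helper: compares two bytes as unsigned values (0 to 255).
    static int compareUnsigned(byte b1, byte b2) {
        return Byte.toUnsignedInt(b1) - Byte.toUnsignedInt(b2);
    }

    public static void main(String[] args) {
        byte high = (byte) 0xFF; // -1 when read as a signed byte
        byte low = 0x00;
        // Signed comparison: 0xFF is -1, so it sorts before 0x00.
        System.out.println(Byte.compare(high, low) < 0);
        // Unsigned comparison: 0xFF is 255, so it sorts after 0x00.
        System.out.println(compareUnsigned(high, low) > 0);
    }
}
```

The subtraction cannot overflow here because both operands are in the range 0 to 255.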


Explosion°爆炸

#3 · 2019-01-23 16:40

Using Guava, you can use either of:

UnsignedBytes.lexicographicalComparator()
SignedBytes.lexicographicalComparator()

The UnsignedBytes comparator appears to have an optimized form using Unsafe that it uses if it can. Comments in the code indicate that it may be at least twice as fast as a normal Java implementation.


Luminary・发光体

#4 · 2019-01-23 16:44

You can use a comparator that compares the Character.toLowerCase() of each byte in the array, assuming the byte[] holds ASCII. If it does not, you will need to decode the characters yourself or use new String(bytes, charSet).toLowerCase(), but that is unlikely to be efficient.
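A rough sketch of that idea, assuming every byte is 7-bit ASCII. The class name AsciiIgnoreCaseComparator is made up for illustration, and note that this gives a case-insensitive text order rather than a raw byte order.

```java
import java.util.Comparator;

final class AsciiIgnoreCaseComparator implements Comparator<byte[]> {
    @Override
    public int compare(byte[] left, byte[] right) {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            // Assumes ASCII input, so each byte maps directly to one char.
            char a = Character.toLowerCase((char) (left[i] & 0xff));
            char b = Character.toLowerCase((char) (right[i] & 0xff));
            if (a != b) {
                return a - b;
            }
        }
        // Equal up to the shorter length: the shorter array sorts first.
        return left.length - right.length;
    }
}
```

With this comparator, keys that differ only in letter case compare as equal, so a TreeMap would treat them as the same key.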


beautiful°

#5 · 2019-01-23 16:48

Found this nice piece of code in Apache HBase:

    public int compare(byte[] left, byte[] right) {
        for (int i = 0, j = 0; i < left.length && j < right.length; i++, j++) {
            int a = (left[i] & 0xff);
            int b = (right[j] & 0xff);
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    }
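To tie this back to the question, here is a minimal sketch of how a compare method like the one above could be wrapped in a Comparator<byte[]> and handed to a TreeMap. The names ByteArrayTreeMapDemo, LEXICOGRAPHIC and sorted are made up for illustration and do not come from HBase.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class ByteArrayTreeMapDemo {
    // Unsigned lexicographic order, same logic as the compare method above.
    static final Comparator<byte[]> LEXICOGRAPHIC = (left, right) -> {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    };

    // Copies the entries into a TreeMap ordered by the comparator.
    static <V> TreeMap<byte[], V> sorted(Map<byte[], V> source) {
        TreeMap<byte[], V> result = new TreeMap<>(LEXICOGRAPHIC);
        result.putAll(source);
        return result;
    }

    public static void main(String[] args) {
        Map<byte[], String> unsorted = new HashMap<>();
        unsorted.put(new byte[] {(byte) 0xFF}, "high");
        unsorted.put(new byte[] {0x00, 0x01}, "low");
        unsorted.put(new byte[] {0x00}, "lowest");
        // Values are printed in key order: [lowest, low, high]
        System.out.println(sorted(unsorted).values());
    }
}
```

The TreeMap decides key equality through the comparator, so two distinct byte[] instances with the same contents collapse into one entry, unlike in a HashMap, where arrays are hashed by identity.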
