What are the non-hex characters in HBase Shell Row

Published 2020-02-28 17:42

Question:

I am saving my key as a byte array. When I look at the key in the HBase shell, I see non-hex values. I do not have any encoding enabled, and I do not have any compression enabled.

Here is a sample...what is VNQ? what is BBW? I'm guessing there is some sort of encoding going on?

\xFB\xC6\xE8\x03\xF0VNQ\x8By\xF6\x89D\xC1\xBBW\x00\x00\x00\x00\x00\x00\x01\xF3\x00\x00\x00\x00\x00\x07\xA1\x1F

Answer 1:

The HBase shell prints byte arrays (keys and values) in a "binary string" (escaped hexadecimal) representation; see the Bytes.toStringBinary method. For each byte, this method does one of two things:

  1. Print it as the corresponding character if the byte is a printable ASCII character (letters, digits, common punctuation).
  2. Print it as \xHH (where each 'H' is a hex digit) if the byte is not a printable ASCII character.

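The goal is a readable representation: if your keys and values consisted entirely of printable characters, the shell would print no \xHH sequences at all. In your sample, VNQ is simply the bytes 0x56 0x4E 0x51 shown as ASCII, and what looks like "BBW" is really the byte \xBB followed by W (0x57). No encoding or compression is involved.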
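To make the two rules concrete, here is a minimal sketch in plain Ruby that mimics the escaping. It is not HBase's actual code: the helper names to_string_binary and to_hex are hypothetical, and the exact set of "printable" bytes (0x20 through 0x7E, excluding the backslash) is an assumption that may differ between HBase versions.

```ruby
# Illustrative sketch only; HBase's own implementation is Bytes.toStringBinary.
# Assumption: bytes 0x20..0x7E except the backslash (0x5C) are printed as-is.
def to_string_binary(bytes)
  bytes.map do |b|
    if b >= 0x20 && b <= 0x7E && b != 0x5C
      b.chr                    # rule 1: printable ASCII, shown as the character
    else
      format('\\x%02X', b)     # rule 2: anything else, shown as \xHH
    end
  end.join
end

# Plain hex rendering of the same bytes, for comparison.
def to_hex(bytes)
  bytes.map { |b| format('%02x', b) }.join
end

# A fragment of the key from the question: \xF0VNQ ... \xC1\xBBW
fragment = [0xF0, 0x56, 0x4E, 0x51, 0xC1, 0xBB, 0x57]
puts to_string_binary(fragment)
puts to_hex(fragment)
```

The first line printed mixes \xHH escapes with the letters V, N, Q and W, while the second shows every byte as two hex digits, which is the difference between the two representations discussed here.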

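If you prefer a pure hex representation instead, try the following in the HBase shell. The last line is the output, not a command, and the single quotes keep the \xHH escapes as literal text for toBytesBinary to parse:

> import org.apache.hadoop.hbase.util.Bytes
> Bytes.toHex(Bytes.toBytesBinary('\xFB\xC6\xE8\x03\xF0VNQ'))
fbc6e803f0564e51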

You can modify the HBase shell's Ruby wrappers to print data with toHex() instead of toStringBinary(). Better still, if you feel like it, contribute a patch to HBase that adds a flag for choosing between the two; see the HBase developer guide.