Convert byte[] with binary data to String

Posted 2019-07-04 01:16

Question:

I have data in binary format (hex: 80 3b c8 87 0a 89) and I need to convert it into a String in order to save it in an MS Access db via Jackcess. I know that I'm not supposed to use String in Java for binary data; however, the Access db is a third-party product and I have no control over it whatsoever.

So I tried to convert binary data and save it, but unfortunately the result was unexpected.

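// Values above 0x7f need an explicit (byte) cast, because Java bytes are signed
byte[] byteArray = new byte[] {(byte) 0x80, 0x3b, (byte) 0xc8, (byte) 0x87, 0x0a, (byte) 0x89};
System.out.println(String.format("%02X ", byteArray[0]) + String.format("%02X ", byteArray[1])); // gives me the same values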

String value = new String(byteArray, "UTF-8");//or any other encoding
System.out.println(value);//completely different values

I would like to know what is going on under new String, and whether there is a way to convert binary data into a String and keep the same hex values.

Note 1: Initially I read a binary file, which has nothing to do with hex. I use hex just for comparison of datasets.

Note 2: There was a suggestion to use Base64 (aka MIME), UTF-7, etc. As I understand it, that takes binary data and encodes it into an ANSI charset, basically altering the initial data. However, for me that is not a solution, because I must write the exact data that I hold in the binary array.

byte[] byteArray = new byte[] {0x2f, 0x7a, 0x2d, 0x28};
byte[] bytesEncoded = Base64.encodeBase64(byteArray);
System.out.println("encoded value is " + new String(bytesEncoded)); // new data

Answer 1:

In order to safely convert arbitrary binary data into text, you should use something like hex or base64. Encodings such as UTF-8 are meant to encode arbitrary text data as bytes, not to encode arbitrary binary data as text. It's a difference in terms of what the source data is.
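To illustrate the difference, here is a minimal sketch using only the JDK (java.util.Base64, available since Java 8). The class and method names are made up for this example. It pushes the bytes from the question through a UTF-8 decode/encode cycle and through a Base64 encode/decode cycle, then checks which one returns the original bytes. The UTF-8 decoder substitutes a replacement character for byte sequences that are not valid UTF-8, so that round trip should not be expected to preserve the data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class RoundTripDemo {

    /** Treats the bytes as UTF-8 text and encodes that text back to bytes (lossy for arbitrary data). */
    static byte[] viaUtf8(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes the bytes as Base64 text and decodes that text back to bytes (reversible). */
    static byte[] viaBase64(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // The sample bytes from the question
        byte[] original = {(byte) 0x80, 0x3b, (byte) 0xc8, (byte) 0x87, 0x0a, (byte) 0x89};

        System.out.println("UTF-8 round trip preserved bytes:  "
                + Arrays.equals(original, viaUtf8(original)));
        System.out.println("Base64 round trip preserved bytes: "
                + Arrays.equals(original, viaBase64(original)));
    }
}
```

The Base64 text itself looks nothing like the input, but decoding it gives back exactly the bytes that went in, which is the property that matters when the storage field only accepts text.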

I would strongly recommend using a library for this. For example, with Guava:

String hex = BaseEncoding.base16().encode(byteArray);
// Store hex in the database in the text field...
...
// Get hex from the database from the text field...
byte[] binary = BaseEncoding.base16().decode(hex);

(Other libraries are available, of course, such as Apache Commons Codec.)
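If adding a dependency is not an option, the same hex round trip can be sketched with the JDK alone. This is an illustrative hand-rolled version, not a replacement for a tested library; the HexCodec class and its method names are hypothetical. It produces uppercase hex, two characters per byte.

```java
final class HexCodec {

    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    /** Encodes each byte as two uppercase hex characters. */
    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(DIGITS[(b >> 4) & 0x0F]); // high nibble
            sb.append(DIGITS[b & 0x0F]);        // low nibble
        }
        return sb.toString();
    }

    /** Decodes a hex string (either case) back into the original bytes. */
    static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex character near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // The sample bytes from the question
        byte[] original = {(byte) 0x80, 0x3b, (byte) 0xc8, (byte) 0x87, 0x0a, (byte) 0x89};
        String hex = encode(original);
        System.out.println(hex);
        System.out.println(java.util.Arrays.equals(original, decode(hex)));
    }
}
```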

Alternatively, save your binary data into a field in Access which is designed for binary data, instead of converting it to text at all.



Answer 2:

The basic lesson to take away: never mix up binary data with its String equivalent.

My mistake was that I exported the initial data from Access into CSV while changing the type of the index field from binary to String (a total mess, now I know). The solution I came up with is my own export tool for Access, in which all data is kept as binary. Thanks to @gord-thompson; his comment led to the solution.



Tags: java jackcess