I'm looking to convert a Java char array to a byte array without creating an intermediate String, as the char array contains a password. I've looked up a couple of methods, but they all seem to fail:
char[] password = "password".toCharArray();
byte[] passwordBytes1 = new byte[password.length*2];
ByteBuffer.wrap(passwordBytes1).asCharBuffer().put(password);
byte[] passwordBytes2 = new byte[password.length*2];
for(int i=0; i<password.length; i++) {
passwordBytes2[2*i] = (byte) ((password[i]&0xFF00)>>8);
passwordBytes2[2*i+1] = (byte) (password[i]&0x00FF);
}
String passwordAsString = new String(password);
String passwordBytes1AsString = new String(passwordBytes1);
String passwordBytes2AsString = new String(passwordBytes2);
System.out.println(passwordAsString);
System.out.println(passwordBytes1AsString);
System.out.println(passwordBytes2AsString);
assertTrue(passwordAsString.equals(passwordBytes1) || passwordAsString.equals(passwordBytes2));
The assertion always fails (and, critically, when the code is used in production, the password is rejected), yet the print statements print out password three times. Why are passwordBytes1AsString and passwordBytes2AsString different from passwordAsString, yet appear identical? Am I missing out a null terminator or something? What can I do to make the conversion and unconversion work?
When you call getBytes() on a String in Java without arguments, the result depends on the platform's default charset (e.g. StandardCharsets.UTF_8 or StandardCharsets.ISO_8859_1).
So whenever you call getBytes on a String object, make sure to specify the encoding, like:
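A minimal sketch of passing the charset explicitly. The class name, the helper method and the sample text are made up for illustration; only getBytes(Charset) and StandardCharsets come from the JDK. It shows that the same text produces byte arrays of different lengths under different charsets:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetDemo {
    // Hypothetical helper: number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9"; // "café": three ASCII letters plus one accented letter

        System.out.println(encodedLength(sample, StandardCharsets.UTF_8));      // 5
        System.out.println(encodedLength(sample, StandardCharsets.ISO_8859_1)); // 4
        System.out.println(encodedLength(sample, StandardCharsets.UTF_16BE));   // 8
    }
}
```

Because the charset is named in the call, the result no longer depends on the machine the code runs on.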
Let's check what has happened with the code. In Java, the String named sample is stored as Unicode: each char in the String is a 16-bit (2-byte) UTF-16 code unit.
But when we call getBytes on a String, we have
In order to get the original bytes of the String, we can just read the memory of the String and get each byte. Below is the sample code:
Usages:
Conversion between char and byte is character set encoding and decoding. I prefer to make it as clear as possible in code. It doesn't really mean extra code volume:
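One way this could look, as a sketch rather than the answer's original code: it assumes UTF-8 as the charset and uses made-up class and method names. No String is created, and the temporary buffers are wiped when they expose a backing array:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharByteConversion {
    // Encodes the chars as UTF-8 (an assumed choice) without going through a String.
    static byte[] toBytes(char[] chars) {
        ByteBuffer byteBuffer = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        byte[] bytes = new byte[byteBuffer.remaining()];
        byteBuffer.get(bytes);
        if (byteBuffer.hasArray()) {
            Arrays.fill(byteBuffer.array(), (byte) 0); // wipe the temporary copy
        }
        return bytes;
    }

    // Decodes UTF-8 bytes back to chars, again without a String.
    static char[] toChars(byte[] bytes) {
        CharBuffer charBuffer = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
        char[] chars = new char[charBuffer.remaining()];
        charBuffer.get(chars);
        if (charBuffer.hasArray()) {
            Arrays.fill(charBuffer.array(), '\0'); // wipe the temporary copy
        }
        return chars;
    }

    public static void main(String[] args) {
        char[] password = {'p', 'a', 's', 's', 'w', 'o', 'r', 'd'};
        byte[] bytes = toBytes(password);
        char[] roundTrip = toChars(bytes);
        System.out.println(Arrays.equals(password, roundTrip)); // true
    }
}
```

The caller is still responsible for clearing the returned arrays once the password is no longer needed.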
Aside:
The java.nio classes and the java.io Reader/Writer classes use ByteBuffer and CharBuffer (which use byte[] and char[] as backing arrays), so it is often preferable to use these classes directly. However, you can always do:
Original Answer
Edited to use StandardCharsets
Here is a JavaDoc page for StandardCharsets. Note this on the JavaDoc page:
If you want to use a ByteBuffer and CharBuffer, don't do the simple .asCharBuffer(), which simply does a UTF-16 conversion in the buffer's byte order (big-endian by default; you can set the byte order with the order method), since Java Strings and thus your char[] internally use this encoding.
Use Charset.forName(charsetName), and then its encode or decode method, or the newEncoder/newDecoder.
When converting your byte[] to String, you should also indicate the encoding (and it should be the same one).
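A sketch of the newEncoder/newDecoder route, assuming UTF-8 as the charset name; the class and method names are hypothetical. Unlike Charset.encode and Charset.decode, which silently substitute a replacement for bad input, a fresh encoder or decoder reports malformed input with a CharacterCodingException, which is wrapped here in an unchecked exception:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

class StrictCodec {
    // Assumed charset; both directions must use the same one.
    private static final Charset CHARSET = Charset.forName("UTF-8");

    static byte[] encode(char[] chars) {
        try {
            ByteBuffer out = CHARSET.newEncoder().encode(CharBuffer.wrap(chars));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("chars cannot be encoded", e);
        }
    }

    static char[] decode(byte[] bytes) {
        try {
            CharBuffer out = CHARSET.newDecoder().decode(ByteBuffer.wrap(bytes));
            char[] chars = new char[out.remaining()];
            out.get(chars);
            return chars;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("bytes are not valid in this charset", e);
        }
    }

    public static void main(String[] args) {
        char[] password = "password".toCharArray(); // literal used only for the demo
        char[] roundTrip = decode(encode(password));
        System.out.println(java.util.Arrays.equals(password, roundTrip)); // true
    }
}
```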
This is an extension to Peter Lawrey's answer. For the backward (bytes-to-chars) conversion to work correctly for the whole range of chars, the code should be as follows:
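The referenced answer is not reproduced here, so the following is a reconstruction under assumptions: two bytes per char, high byte first, with made-up class and method names. The part that matters is the & 0xff mask in toChars:

```java
class ManualCharBytes {
    // Splits each char into its high and low byte (big-endian order, assumed).
    static byte[] toBytes(char[] chars) {
        byte[] bytes = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            bytes[2 * i] = (byte) (chars[i] >> 8);
            bytes[2 * i + 1] = (byte) chars[i];
        }
        return bytes;
    }

    // Rebuilds each char from two bytes, masking to undo sign extension.
    static char[] toChars(byte[] bytes) {
        char[] chars = new char[bytes.length / 2];
        for (int i = 0; i < chars.length; i++) {
            int high = bytes[2 * i] & 0xff;
            int low = bytes[2 * i + 1] & 0xff;
            chars[i] = (char) ((high << 8) | low);
        }
        return chars;
    }

    public static void main(String[] args) {
        char[] original = {'a', '\u00e9', '\u20ac'};
        char[] roundTrip = toChars(toBytes(original));
        System.out.println(java.util.Arrays.equals(original, roundTrip)); // true
    }
}
```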
We need to "unsign" the bytes before using them (& 0xff). Otherwise half of all possible char values will not come back correctly. For instance, chars within the [0x80..0xff] range will be affected.
The problem is your use of the String(byte[]) constructor, which uses the platform default encoding. That's almost never what you should be doing - if you pass in "UTF-16" as the character encoding, your tests will probably pass. Currently I suspect that passwordBytes1AsString and passwordBytes2AsString are each 16 characters long, with every other character being U+0000.
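A small check of that suspicion, as a sketch: the class and method names are invented, and ISO-8859-1 stands in for "some single-byte platform default" so the result does not depend on the machine. The bytes are produced the same way as passwordBytes1 in the question:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class Utf16DecodeDemo {
    // Same technique as passwordBytes1 in the question: two bytes per char, big-endian.
    static byte[] toUtf16Bytes(char[] chars) {
        byte[] bytes = new byte[chars.length * 2];
        ByteBuffer.wrap(bytes).asCharBuffer().put(chars);
        return bytes;
    }

    public static void main(String[] args) {
        byte[] bytes = toUtf16Bytes("password".toCharArray());

        // Decoded with a single-byte charset: every zero byte becomes U+0000.
        String wrong = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length());       // 16
        System.out.println((int) wrong.charAt(0)); // 0

        // Decoded with the matching charset: the original text comes back.
        String right = new String(bytes, StandardCharsets.UTF_16);
        System.out.println(right.equals("password")); // true
    }
}
```

U+0000 is usually invisible when printed, which would explain why the three printed lines look identical even though the strings differ.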