Question
Are the Java 8 java.util.Base64 MIME Encoder and Decoder a drop-in replacement for the unsupported, internal Java API sun.misc.BASE64Encoder and sun.misc.BASE64Decoder?
What I think so far and why
Based on my investigation and quick tests (see code below), it should be a drop-in replacement, because:
- sun.misc.BASE64Encoder, based on its JavaDoc, is a BASE64 Character encoder as specified in RFC1521. This RFC is part of the MIME specification...
- java.util.Base64, based on its JavaDoc, Uses the "The Base64 Alphabet" as specified in Table 1 of RFC 2045 for encoding and decoding operation... under MIME
Assuming no significant changes between RFC 1521 and RFC 2045 (I could not find any), and based on my quick test, using the Java 8 Base64 MIME Encoder/Decoder should be fine.
What I am looking for
- an authoritative source confirming or disproving the "drop-in replacement" point OR
- a counterexample which shows a case where java.util.Base64 has different behaviour than the sun.misc.BASE64Encoder OpenJDK Java 8 implementation (8u40-b25) (BASE64Decoder) OR
- anything else you think definitively answers the question above
For reference
My test code
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Objects;

public class Base64EncodingDecodingRoundTripTest {
    public static void main(String[] args) throws IOException {
        String test1 = " ~!@#$%^& *()_+=`| }{[]\\;: \"?><,./ ";
        String test2 = test1 + test1;
        encodeDecode(test1);
        encodeDecode(test2);
    }

    static void encodeDecode(final String testInputString) throws IOException {
        sun.misc.BASE64Encoder unsupportedEncoder = new sun.misc.BASE64Encoder();
        sun.misc.BASE64Decoder unsupportedDecoder = new sun.misc.BASE64Decoder();
        Base64.Encoder mimeEncoder = java.util.Base64.getMimeEncoder();
        Base64.Decoder mimeDecoder = java.util.Base64.getMimeDecoder();
        // Encode with an explicit charset so it matches the charset used when decoding below
        byte[] inputBytes = testInputString.getBytes(StandardCharsets.UTF_8);
        String sunEncoded = unsupportedEncoder.encode(inputBytes);
        System.out.println("sun.misc encoded: " + sunEncoded);
        String mimeEncoded = mimeEncoder.encodeToString(inputBytes);
        System.out.println("Java 8 Base64 MIME encoded: " + mimeEncoded);
        // Cross-decode: each decoder reads the other encoder's output
        byte[] mimeDecoded = mimeDecoder.decode(sunEncoded);
        String mimeDecodedString = new String(mimeDecoded, StandardCharsets.UTF_8);
        byte[] sunDecoded = unsupportedDecoder.decodeBuffer(mimeEncoded); // throws IOException
        String sunDecodedString = new String(sunDecoded, StandardCharsets.UTF_8);
        System.out.println(String.format("sun.misc decoded: %s | Java 8 Base64 decoded: %s", sunDecodedString, mimeDecodedString));
        System.out.println("Decoded results are both equal: " + Objects.equals(sunDecodedString, mimeDecodedString));
        System.out.println("Mime decoded result is equal to test input string: " + Objects.equals(testInputString, mimeDecodedString));
        System.out.println("\n");
    }
}
Here's a small test program that illustrates a difference in the encoded strings:
byte[] bytes = new byte[57];
String enc1 = new sun.misc.BASE64Encoder().encode(bytes);
String enc2 = new String(java.util.Base64.getMimeEncoder().encode(bytes),
StandardCharsets.UTF_8);
System.out.println("enc1 = <" + enc1 + ">");
System.out.println("enc2 = <" + enc2 + ">");
System.out.println(enc1.equals(enc2));
Its output is:
enc1 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>
enc2 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA>
false
Note that the encoded output of sun.misc.BASE64Encoder has a newline at the end. It doesn't always append a newline, but it happens to do so if the encoded string has exactly 76 characters on its last line. (The author of java.util.Base64 considered this to be a small bug in the sun.misc.BASE64Encoder implementation; see the review thread.)
This might seem like a triviality, but if you had a program that relied on this specific behavior, switching encoders might result in malformed output. Therefore, I conclude that java.util.Base64 is not a drop-in replacement for sun.misc.BASE64Encoder.
Of course, the intent of java.util.Base64 is that it's a functionally equivalent, RFC-conformant, high-performance, fully supported and specified replacement that's intended to support migration of code away from sun.misc.BASE64Encoder. You need to be aware of some edge cases like this when migrating, though.
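For code that depends on the trailing newline described above, a minimal sketch of a compatibility shim might look like the following. It uses only java.util.Base64, so it does not call the legacy class. The helper name encodeWithLegacyTrailer is hypothetical, and the rule it applies (append a separator when the last line is exactly 76 characters long) is taken from the description above rather than from any specification. Which line separator the legacy encoder emits is left as a parameter here, since that is an assumption you would need to check against your own data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class MimeTrailingNewlineSketch {

    // Hypothetical helper, not part of any JDK API. Encodes with the MIME encoder
    // (76-character lines) and appends the separator once more when the last line is
    // exactly 76 characters long, following the behaviour described in the answer above.
    static String encodeWithLegacyTrailer(byte[] input, String lineSeparator) {
        byte[] sep = lineSeparator.getBytes(StandardCharsets.US_ASCII);
        String encoded = Base64.getMimeEncoder(76, sep).encodeToString(input);
        int lastBreak = encoded.lastIndexOf(lineSeparator);
        int lastLineLength = (lastBreak < 0)
                ? encoded.length()
                : encoded.length() - (lastBreak + lineSeparator.length());
        return lastLineLength == 76 ? encoded + lineSeparator : encoded;
    }

    public static void main(String[] args) {
        // 57 input bytes encode to exactly one full 76-character line.
        String plain = Base64.getMimeEncoder().encodeToString(new byte[57]);
        System.out.println(plain.length());                 // 76, no trailing separator

        String legacyLike = encodeWithLegacyTrailer(new byte[57], "\n");
        System.out.println(legacyLike.endsWith("\n"));      // true

        // 58 input bytes leave a short last line, so nothing is appended.
        String shortTail = encodeWithLegacyTrailer(new byte[58], "\n");
        System.out.println(shortTail.endsWith("\n"));       // false
    }
}
```

If nothing downstream depends on the trailing newline, the simpler route is to use Base64.getMimeEncoder() directly and trim or ignore trailing whitespace on the consuming side.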
I had the same issue when I moved from sun.misc to java.util.Base64; switching to org.apache.commons.codec.binary.Base64 solved my problem.
There are no changes to the base64 specification between rfc1521 and rfc2045.
All base64 implementations could be considered to be drop-in replacements of one another. The only differences between base64 implementations are:
- the alphabet used.
- the APIs provided (e.g. some might only act on a full input buffer, while others might be finite state machines that let you keep pushing chunks of input through them until you are done).
The MIME base64 alphabet has remained constant between RFC versions (it has to or older software would break) and is: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
As Wikipedia notes, only the last 2 characters may change between base64 implementations.
As an example of a base64 implementation that does change one of the last 2 characters, the IMAP MUTF-7 specification uses the following base64 alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,
The reason for the change is that the / character is often used as a path delimiter and, since the MUTF-7 encoding is used to flatten non-ASCII directory paths into ASCII, the / character needed to be avoided in encoded segments.
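To illustrate how two base64 variants can differ only in the last two alphabet characters, here is a small sketch using the two alphabets that java.util.Base64 itself ships: the basic one (+ and /) and the URL-safe one (- and _). This is not the MUTF-7 alphabet mentioned above; it is only a convenient stand-in available in the JDK. The class and method names are made up for the example.

```java
import java.util.Base64;

final class Base64AlphabetSketch {

    // Standard alphabet: indexes 62 and 63 map to '+' and '/'.
    static String basic(byte[] input) {
        return Base64.getEncoder().encodeToString(input);
    }

    // URL-safe alphabet: indexes 62 and 63 map to '-' and '_'.
    static String urlSafe(byte[] input) {
        return Base64.getUrlEncoder().encodeToString(input);
    }

    public static void main(String[] args) {
        // These two bytes produce the 6-bit values 62, 63 and 60,
        // so the first two output characters depend on the alphabet in use.
        byte[] input = {(byte) 0xfb, (byte) 0xff};
        System.out.println(basic(input));   // +/8=
        System.out.println(urlSafe(input)); // -_8=
    }
}
```

Everything except the characters for the values 62 and 63 is identical, which is why a decoder for one variant rejects or misreads only those characters from the other.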
Assuming both encoders are bug-free, the RFC requires distinct encodings for every 0-byte, 1-byte, 2-byte and 3-byte sequence. Longer sequences are broken down into as many 3-byte sequences as needed, followed by a final sequence. Hence, if the two implementations handle all 16,843,009 (1+256+65536+16777216) possible sequences identically, then the two implementations are also identical.
These tests only take a few minutes to run. By slightly changing your test code, I have done that, and my Java 8 installation passed all the tests. Hence the public implementation can be used to safely replace the sun.misc implementation.
Here is my test code:
import java.util.Base64;
import java.util.Arrays;
import java.io.IOException;

public class Base64EncodingDecodingRoundTripTest {
    public static void main(String[] args) throws IOException {
        System.out.println("Testing zero byte encoding");
        encodeDecode(new byte[0]);
        System.out.println("Testing single byte encodings");
        byte[] test = new byte[1];
        for (int i = 0; i < 256; i++) {
            test[0] = (byte) i;
            encodeDecode(test);
        }
        System.out.println("Testing double byte encodings");
        test = new byte[2];
        for (int i = 0; i < 65536; i++) {
            test[0] = (byte) i;
            test[1] = (byte) (i >>> 8);
            encodeDecode(test);
        }
        System.out.println("Testing triple byte encodings");
        test = new byte[3];
        for (int i = 0; i < 16777216; i++) {
            test[0] = (byte) i;
            test[1] = (byte) (i >>> 8);
            test[2] = (byte) (i >>> 16);
            encodeDecode(test);
        }
        System.out.println("All tests passed");
    }

    static void encodeDecode(final byte[] testInput) throws IOException {
        sun.misc.BASE64Encoder unsupportedEncoder = new sun.misc.BASE64Encoder();
        sun.misc.BASE64Decoder unsupportedDecoder = new sun.misc.BASE64Decoder();
        Base64.Encoder mimeEncoder = java.util.Base64.getMimeEncoder();
        Base64.Decoder mimeDecoder = java.util.Base64.getMimeDecoder();
        String sunEncoded = unsupportedEncoder.encode(testInput);
        String mimeEncoded = mimeEncoder.encodeToString(testInput);
        // check encodings equal
        if (!sunEncoded.equals(mimeEncoded)) {
            throw new IOException("Input " + Arrays.toString(testInput) + " produced different encodings (sun=\"" + sunEncoded + "\", mime=\"" + mimeEncoded + "\")");
        }
        // Check cross decodes are equal. Note encoded forms are identical
        byte[] mimeDecoded = mimeDecoder.decode(sunEncoded);
        byte[] sunDecoded = unsupportedDecoder.decodeBuffer(mimeEncoded); // throws IOException
        if (!Arrays.equals(mimeDecoded, sunDecoded)) {
            throw new IOException("Input " + Arrays.toString(testInput) + " was encoded as \"" + sunEncoded + "\", but decoded as sun=" + Arrays.toString(sunDecoded) + " and mime=" + Arrays.toString(mimeDecoded));
        }
    }
}
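The 3-byte block argument behind this test can be checked on its own with java.util.Base64. The sketch below is only an illustration: the class and method names are made up. It encodes an input one 3-byte block at a time and compares the joined result with encoding the whole input at once. It deliberately uses the basic encoder, which inserts no line separators. The MIME encoder additionally breaks its output into 76-character lines, and a test limited to inputs of at most 3 bytes never produces a full line, so it does not exercise that part.

```java
import java.util.Arrays;
import java.util.Base64;

final class Base64BlockSketch {

    // Encodes each 3-byte block separately and joins the results.
    // Only the final block can be shorter than 3 bytes, so padding appears only at the end.
    static String encodeBlockwise(byte[] input) {
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length; i += 3) {
            int end = Math.min(i + 3, input.length);
            out.append(encoder.encodeToString(Arrays.copyOfRange(input, i, end)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] input = new byte[100];
        for (int i = 0; i < input.length; i++) {
            input[i] = (byte) (i * 37); // arbitrary test pattern
        }
        String whole = Base64.getEncoder().encodeToString(input);
        System.out.println(encodeBlockwise(input).equals(whole)); // true

        // Number of distinct inputs of length 0, 1, 2 and 3 bytes.
        long cases = 1L + 256L + 65_536L + 16_777_216L;
        System.out.println(cases); // 16843009
    }
}
```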