I have some CLOB columns in a database that I need to put Base64-encoded binary files in. These files can be large, so I need to stream them; I can't read the whole thing in at once.
I'm using org.apache.commons.codec.binary.Base64InputStream to do the encoding, and I'm running into a problem. My code is essentially this:
FileInputStream fis = new FileInputStream(file);
Base64InputStream b64is = new Base64InputStream(fis, true, -1, null);
BufferedReader reader = new BufferedReader(new InputStreamReader(b64is));
preparedStatement.setCharacterStream(1, reader);
When I run the above code, I get this during the execution of the update:
java.io.IOException: Underlying input stream returned zero bytes
It is thrown deep in the InputStreamReader code.
Why would this not work? It seems to me like the reader would attempt to read from the Base64 stream, which would read from the file stream, and everything should be happy.
This appears to be a bug in Base64InputStream. You're calling it correctly. You should report this to the Apache Commons Codec project.
Simple test case:
The read(byte[]) call of InputStream is not allowed to return 0 for a non-empty buffer. It does return 0 on any file which is a multiple of 3 bytes long.
"For top efficiency, consider wrapping an InputStreamReader within a BufferedReader. For example:"
Addendum: As Base64 is padded to a multiple of 4 characters, verify that the source isn't truncated. A flush() may be required.
Interesting, I did some tests here and it indeed throws that exception when you read the Base64InputStream using an InputStreamReader, regardless of the source of the stream, but it works flawlessly when you read it as a binary stream. As Trashgod mentioned, Base64 encoding is framed. The InputStreamReader should in fact have invoked flush() on the Base64InputStream once more to see if it doesn't return any more data. I don't see other ways to fix this than implementing your own Base64InputStreamReader or Base64Reader. This is actually a bug, see Keith's answer.
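The zero-byte failure discussed above can be reproduced with the JDK alone. The following is a minimal sketch, not the test case from the answer: `ZeroReturningStream` and `ZeroReadDemo` are hypothetical names, and the stream simply imitates the reported behavior by returning 0 from `read(byte[], int, int)`. On typical JDKs, wrapping such a stream in an `InputStreamReader` should raise the same `IOException` as in the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ZeroReadDemo {
    // Hypothetical stream that breaks the InputStream contract: when asked
    // for at least one byte it returns 0 instead of blocking or returning -1.
    static class ZeroReturningStream extends InputStream {
        @Override
        public int read() {
            return -1; // single-byte reads just report end of stream
        }

        @Override
        public int read(byte[] b, int off, int len) {
            return 0; // the contract violation being demonstrated
        }
    }

    // Returns the message of the IOException raised by the reader,
    // or null if the read completed without an exception.
    static String readerFailureMessage() {
        try (Reader reader = new InputStreamReader(
                new ZeroReturningStream(), StandardCharsets.US_ASCII)) {
            reader.read();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(readerFailureMessage());
    }
}
```

Because the reader treats a zero-length read as an error rather than retrying, any stream handed to an `InputStreamReader` has to either deliver at least one byte or signal end of stream with -1.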
As a workaround you can also just store it in a BLOB instead of a CLOB in the DB and use PreparedStatement#setBinaryStream() instead. It doesn't matter if it's stored as binary data or not. You don't want to have such large Base64 data to be indexable or searchable anyway.
Update: since that's not an option, and having the Apache Commons Codec guys fix the Base64InputStream bug, which I reported as CODEC-101, might take some time, you may consider using another 3rd party Base64 API. I've found one here (public domain, so you can do whatever you want with it, even place it in your own package). I've tested it here and it works fine.
Update 2: the Commons Codec guy fixed it pretty quickly. I tried it here and it works fine.
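If upgrading Commons Codec is not possible, a hand-rolled reader along the lines suggested earlier is another route. The sketch below is an assumption-laden illustration, not what anyone in this thread used: it relies on `java.util.Base64` (Java 8 and later), and `Base64EncodingReader` and `encodeAll` are hypothetical names. It reads the source in chunks whose size is a multiple of 3 bytes, so padding can only appear at the very end, and it never returns 0 for a non-empty buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical Reader that streams the Base64 encoding of an InputStream.
class Base64EncodingReader extends Reader {
    private static final int CHUNK = 3 * 1024; // multiple of 3: no mid-stream padding
    private final InputStream in;
    private final byte[] raw = new byte[CHUNK];
    private byte[] encoded = new byte[0];
    private int pos = 0;
    private boolean eof = false;

    Base64EncodingReader(InputStream in) {
        this.in = in;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (pos == encoded.length && !fill()) return -1;
        int n = Math.min(len, encoded.length - pos);
        for (int i = 0; i < n; i++) {
            cbuf[off + i] = (char) encoded[pos++]; // Base64 output is plain ASCII
        }
        return n;
    }

    // Fills the raw buffer completely (or up to end of stream) and encodes it.
    // Assumes the underlying stream honors the InputStream contract.
    private boolean fill() throws IOException {
        if (eof) return false;
        int filled = 0;
        while (filled < raw.length) {
            int r = in.read(raw, filled, raw.length - filled);
            if (r < 0) {
                eof = true;
                break;
            }
            filled += r;
        }
        if (filled == 0) return false;
        encoded = Base64.getEncoder().encode(Arrays.copyOf(raw, filled));
        pos = 0;
        return true;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Small helper for trying the reader out on an in-memory byte array.
    static String encodeAll(byte[] data) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1000];
        try (Reader r = new Base64EncodingReader(new ByteArrayInputStream(data))) {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

With the code from the question, this would be used as `preparedStatement.setCharacterStream(1, new Base64EncodingReader(new FileInputStream(file)));`. Unlike the call in the question, this encoder emits no line separators.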