I'm working through the problems in Programming Pearls, 2nd edition, Column 1. One of the problems involves writing a program that uses only around 1 megabyte of memory to store the contents of a file as a bit array with each bit representing whether or not a 7 digit number is present in the file. Since Java is the language I'm the most familiar with, I've decided to use it even though the author seems to have had C and C++ in mind.
Since I'm pretending memory is limited for the purpose of the problem I'm working on, I'd like to make sure the process of reading the file has no buffering at all.
I thought InputStreamReader
would be a good solution, until I read this in the Java documentation:
To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.
Ideally, only the bytes that are necessary would be read from the stream -- in other words, I don't want any buffering.
One of the problems involves writing a program that uses only around 1 megabyte of memory to store the contents of a file as a bit array with each bit representing whether or not a 7 digit number is present in the file.
This implies that you need to read the file as bytes (not characters).
Assuming that you do have a genuine requirement to read from a file without buffering, then you should use the FileInputStream
class. It does no buffering. It reads (or attempts to read) precisely the number of bytes that you asked for.
If you then need to convert those bytes to characters, you could do this by applying the appropriate String
constructor to a byte
or byte[]
. Note that for multibyte character encodings such as UTF-8
, you would need to read sufficient bytes to complete each character. Doing that without the possibility of read-ahead is a bit tricky ... and entails "knowledge* of the character encoding you are reading.
(You could avoid that knowledge by using a CharsetDecoder
directly. But then you'd need to use the decode
method that operates on Buffer
objects, and that is a bit complicated too.)
For what it is worth, Java makes a clear distinction between stream-of-byte and stream-of-character I/O. The former is supported by InputStream
and OutputStream
, and the latter by Reader
and Write
. The InputStreamReader
class is a Reader
, that adapts an InputStream
. You should not be considering using it for an application that wants to read stuff byte-wise.