How java.io.Buffer* stream differs from normal str

Posted 2019-04-20 04:24

Question:

1) How do buffered streams work behind the scenes, how do they differ from normal streams, and what are the advantages of using them?

2) DataInputStream is also byte-based, yet it has a readLine() method. What is the point of that?

Answer 1:

From the BufferedInputStream javadoc:

A BufferedInputStream adds functionality to another input stream, namely the ability to buffer the input and to support the mark and reset methods. When the BufferedInputStream is created, an internal buffer array is created. As bytes from the stream are read or skipped, the internal buffer is refilled as necessary from the contained input stream, many bytes at a time. The mark operation remembers a point in the input stream and the reset operation causes all the bytes read since the most recent mark operation to be reread before new bytes are taken from the contained input stream.

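Internally, a buffer array is used: instead of reading bytes individually from the underlying input stream, enough bytes are read in one go to fill the buffer. This generally results in faster performance, as fewer reads are required on the underlying input stream.

BufferedOutputStream works the other way around: bytes written to it are collected in an internal buffer and passed to the underlying output stream in one larger write, when the buffer fills up or when the stream is flushed.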
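The effect on the underlying stream can be made visible with a small experiment. The following is a minimal sketch, not taken from the answer: `CountingInputStream` is a hypothetical helper that counts how many read calls actually reach the wrapped stream, and the sizes used are arbitrary.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ReadCountDemo {
    /** Hypothetical helper: counts read calls that reach the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) { super(in); }

        @Override public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Consumes the stream one byte at a time, as naive code often does. */
    private static void drain(InputStream in) {
        try {
            while (in.read() != -1) { /* discard */ }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Read calls on the source when it is consumed directly. */
    static int unbufferedCalls(int size) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drain(source);
        return source.calls;
    }

    /** Read calls on the source when a BufferedInputStream sits in front of it. */
    static int bufferedCalls(int size, int bufferSize) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drain(new BufferedInputStream(source, bufferSize));
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered calls: " + unbufferedCalls(10_000));
        System.out.println("buffered calls:   " + bufferedCalls(10_000, 1024));
    }
}
```

With an in-memory source the saving is only in method calls, but when each call on the underlying stream means a system call (file or socket), the reduction is where the speed-up comes from.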

mark() and reset() could be used as follows:

1 BufferedInputStream bis = new BufferedInputStream(is);
2 byte[] b = new byte[4];
3 bis.read(b); // read up to 4 bytes into b
4 bis.mark(10); // mark the stream at the current position - we can read 10 bytes before the mark point becomes invalid
5 bis.read(b); // read up to another 4 bytes into b
6 bis.reset(); // reset the position in the stream back to where mark was called
7 bis.read(b); // re-read the same bytes as line 5 into b

To explain mark/reset some more...

The BufferedInputStream internally remembers the current position in the buffer. As you read bytes, the position increments. A call to mark(10) saves the current position. Subsequent calls to read continue to increment the current position, but a call to reset sets the current position back to the value it had when mark was called.

The argument to mark specifies how many bytes you can read after calling mark before the mark position may be invalidated. Once the mark position is invalidated, you can no longer call reset to return to it.

For example, if mark(2) had been used in line 4 above, the reset() on line 6 would no longer be guaranteed to work, since more than 2 bytes were read after the mark; if the mark has been invalidated, reset() throws an IOException. In practice BufferedInputStream may still honor the reset while the marked bytes remain in its internal buffer, but code should not rely on that.
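The mark/reset sequence above can be tried against an in-memory stream. This is a minimal sketch under assumptions: the class name, the `readExactly` helper and the sample text are made up for illustration, and a `ByteArrayInputStream` stands in for the `is` variable of the snippet.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MarkResetDemo {
    /** Hypothetical helper: reads exactly n bytes, looping because read() may return fewer. */
    private static String readExactly(InputStream in, int n) throws IOException {
        byte[] b = new byte[n];
        int off = 0;
        while (off < n) {
            int r = in.read(b, off, n - off);
            if (r == -1) throw new EOFException("stream ended early");
            off += r;
        }
        return new String(b, StandardCharsets.US_ASCII);
    }

    /** Returns {first chunk, chunk read after mark, chunk read again after reset}. */
    static String[] demo(String text) {
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        try (BufferedInputStream bis = new BufferedInputStream(new ByteArrayInputStream(data))) {
            String first = readExactly(bis, 4);
            bis.mark(10);                        // allow up to 10 bytes to be read before reset
            String second = readExactly(bis, 4);
            bis.reset();                         // jump back to the marked position
            String again = readExactly(bis, 4);  // same bytes as 'second'
            return new String[] {first, second, again};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String chunk : demo("ABCDEFGHIJKL")) {
            System.out.println(chunk);
        }
    }
}
```

The explicit loop in `readExactly` is a deliberate choice: a single `read(b)` call is allowed to return fewer bytes than the array holds, so the short form in the snippet is only safe for illustration.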



Answer 2:

Buffered Readers/Writers/InputStreams/OutputStreams exchange data with the OS in large chunks as an optimization. In the case of writers and output streams, the data is buffered in memory until enough has been collected to write out a big chunk. In the case of readers and input streams, a large chunk is read from disk/network/... into the buffer, and all reads are served from that buffer until it is empty and a new chunk is read in.

DataInputStream is indeed byte-based. Its readLine method is deprecated because it does not properly convert bytes to characters; the Javadoc recommends BufferedReader.readLine() instead. Internally it reads from disk/network/... byte by byte until it has collected a complete line. So this stream can be sped up by using a BufferedInputStream as its source, so that the bytes for the line are read from the in-memory buffer instead of directly from disk.
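For reading lines of text, the replacement named in the DataInputStream Javadoc is BufferedReader.readLine(). The sketch below assumes the input is UTF-8 text; the class and method names are hypothetical and only illustrate the wrapping order (bytes, then charset decoding, then buffering and line splitting).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {
    /** Reads all lines from a byte stream, decoding it as UTF-8 (an assumption of this sketch). */
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        // InputStreamReader converts bytes to characters; BufferedReader buffers and splits lines.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(data)));
    }
}
```

Unlike DataInputStream.readLine(), this path takes an explicit charset, so multi-byte characters are decoded correctly.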



Answer 3:

With unbuffered I/O, each read or write request is passed directly to the operating system. Java's buffered I/O streams read and write data through their own memory buffer (usually a byte array). Calls to the operating system are made only when the buffer is empty (when reading) or full (when writing). It is sometimes a good idea to flush the buffer manually at critical points in your application, so that data still sitting in the buffer actually reaches its destination.
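Why a manual flush can matter is easy to see with an in-memory sink. In this minimal sketch (class name, buffer size and payload are arbitrary assumptions), a few bytes are written through a BufferedOutputStream and the size of the underlying ByteArrayOutputStream is checked before and after flush().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FlushDemo {
    /** Returns {bytes visible in the sink before flush, bytes visible after flush}. */
    static int[] sizesAroundFlush() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, 64)) {
            out.write(new byte[] {1, 2, 3}); // 3 bytes fit in the 64-byte buffer
            int before = sink.size();        // nothing has reached the sink yet
            out.flush();                     // push the buffered bytes down
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

Closing a BufferedOutputStream also flushes it, so the explicit flush() is mainly needed when the stream stays open, for example before waiting for a reply on a socket.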

Since the Operating System API calls may result in disk access, network activity and the like, this can be quite expensive. Using buffers to batch the native Operating System I/O into larger chunks often significantly improves performance.



Answer 4:

Buffered streams, as the name suggests, read or write data in larger chunks by buffering it. Depending on the underlying stream, this can improve performance dramatically.

From java.io.BufferedOutputStream's Javadocs:

By setting up such an output stream, an application can write bytes to the underlying output stream without necessarily causing a call to the underlying system for each byte written.
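The Javadoc statement can be checked by counting the write calls that reach the underlying stream. This is a sketch under assumptions: `CountingOutputStream` is a hypothetical sink that discards its data and only counts calls, and the byte count and buffer size are arbitrary.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class WriteCountDemo {
    /** Hypothetical sink: discards all data and counts how often it is called. */
    static class CountingOutputStream extends OutputStream {
        int calls = 0;

        @Override public void write(int b) { calls++; }

        @Override public void write(byte[] b, int off, int len) { calls++; }
    }

    /** Writes n single bytes straight to the sink. */
    static int unbufferedCalls(int n) {
        CountingOutputStream sink = new CountingOutputStream();
        for (int i = 0; i < n; i++) {
            sink.write(i);
        }
        return sink.calls;
    }

    /** Writes n single bytes through a BufferedOutputStream in front of the sink. */
    static int bufferedCalls(int n, int bufferSize) {
        CountingOutputStream sink = new CountingOutputStream();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < n; i++) {
                out.write(i);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls; // read after close(), so the final partial buffer is included
    }

    public static void main(String[] args) {
        System.out.println("unbuffered calls: " + unbufferedCalls(10_000));
        System.out.println("buffered calls:   " + bufferedCalls(10_000, 1024));
    }
}
```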



Answer 5:

With unbuffered streams, every read or write request is handled directly by the underlying operating system, which can be expensive. To reduce this kind of overhead, the Java platform implements buffered I/O streams. Buffered input streams read data from a memory area known as a buffer; the native input API is called only when the buffer is empty. Similarly, buffered output streams write data to a buffer, and the native output API is called only when the buffer is full.