Is TarArchiveInputStream a buffered or unbuffered InputStream?
InputStream inputStream = new TarArchiveInputStream(new GZIPInputStream(new BufferedInputStream(new FileInputStream(file))));
Does this inputStream object store the entire file in heap memory? Or is it just a pointer to the file that stores nothing in memory?
Based on the source code of commons-compress.jar version 1.4:
What happens when we create an instance of TarArchiveInputStream?
Apart from other initialization, the important object that gets created is an instance of TarBuffer, which internally holds a byte[] blockBuffer whose default size is (DEFAULT_RCDSIZE * 20), i.e., 512 * 20 = 10,240 bytes (10 KB).
This TarBuffer object performs the actual read operations: it reads data from the underlying tar file into blockBuffer whenever its readBlock() method is called internally, which happens as we invoke TarArchiveInputStream.read(..).
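To make the block-buffer idea concrete, here is a minimal sketch of a reader that fills a 10 KB block from the underlying stream and hands out 512-byte records from it. It is not the actual TarBuffer source; the class name BlockBufferSketch, the blockReads counter, and the countBlockReads helper are hypothetical names invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Illustrative only: a simplified stand-in for a block buffer, not the library class.
class BlockBufferSketch {
    static final int RECORD_SIZE = 512;              // mirrors DEFAULT_RCDSIZE
    static final int BLOCK_SIZE = RECORD_SIZE * 20;  // 10,240 bytes
    static final int RECORDS_PER_BLOCK = BLOCK_SIZE / RECORD_SIZE;

    private final InputStream in;
    private final byte[] blockBuffer = new byte[BLOCK_SIZE];
    private int currentRecord = RECORDS_PER_BLOCK;   // forces a block read on first use
    int blockReads = 0;                              // how often the block was refilled

    BlockBufferSketch(InputStream in) {
        this.in = in;
    }

    /** Returns the next 512-byte record, or null once the input is exhausted. */
    byte[] readRecord() throws IOException {
        if (currentRecord >= RECORDS_PER_BLOCK && !readBlock()) {
            return null;
        }
        int from = currentRecord * RECORD_SIZE;
        currentRecord++;
        return Arrays.copyOfRange(blockBuffer, from, from + RECORD_SIZE);
    }

    /** Refills blockBuffer from the underlying stream; only this much is held in memory. */
    private boolean readBlock() throws IOException {
        int offset = 0;
        while (offset < BLOCK_SIZE) {
            int n = in.read(blockBuffer, offset, BLOCK_SIZE - offset);
            if (n == -1) {
                break;
            }
            offset += n;
        }
        if (offset == 0) {
            return false;                            // nothing left to read
        }
        Arrays.fill(blockBuffer, offset, BLOCK_SIZE, (byte) 0); // zero-pad a short last block
        currentRecord = 0;
        blockReads++;
        return true;
    }

    /** Hypothetical helper: reads totalBytes of dummy data and reports the number of block refills. */
    static int countBlockReads(int totalBytes) {
        BlockBufferSketch buffer = new BlockBufferSketch(new ByteArrayInputStream(new byte[totalBytes]));
        try {
            while (buffer.readRecord() != null) {
                // records are consumed one at a time; the block is refilled only when exhausted
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.blockReads;
    }

    public static void main(String[] args) {
        System.out.println("block size: " + BLOCK_SIZE);
        System.out.println("refills for 3 blocks of data: " + countBlockReads(BLOCK_SIZE * 3));
    }
}
```

However large the input is, the sketch never holds more than one block of it at a time, which is the point being made about blockBuffer.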
Does the object of TarArchiveInputStream store the whole entire file internally into the heap memory?
No. In general, whenever we call the read method of an InputStream, it first tries to serve the data from its application-level buffer, if the stream is buffered. If the requested data is already there, it is served from the buffer. If not, the stream asks the OS (through a system call, which traps into the kernel) to read the data from the OS file cache or disk and copy it into the buffer. (Memory-mapped files are a bit different, since this copy is not needed, but we will not mix that into our discussion.)
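The buffered-read flow described above can be observed with a small experiment using only the JDK. In this sketch, CountingInputStream and BufferedReadDemo are hypothetical names made up for the illustration; the in-memory ByteArrayInputStream stands in for the file, so no real system calls are involved, but the pattern of "serve from the buffer, refill only when empty" is the same.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferedReadDemo {
    /** Hypothetical wrapper that counts bulk reads reaching the underlying source. */
    static class CountingInputStream extends FilterInputStream {
        int bulkReads = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    /** Reads dataSize bytes one at a time; returns {bytesRead, underlyingBulkReads}. */
    static int[] run(int dataSize, int bufferSize) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        try (InputStream in = new BufferedInputStream(source, bufferSize)) {
            int total = 0;
            while (in.read() != -1) {   // each call is normally answered from the buffer
                total++;
            }
            return new int[] { total, source.bulkReads };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = run(1024, 256);
        System.out.println("bytes read: " + result[0]
                + ", bulk reads on the underlying stream: " + result[1]);
    }
}
```

Reading 1024 bytes one at a time through a 256-byte buffer should reach the underlying stream only a handful of times, not 1024 times.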
This is true for TarArchiveInputStream as well. When we call the read method on TarArchiveInputStream, it delegates to the inner inputStream, and the same flow as above applies.
Or is it just a pointer to a file and stores nothing into the memory?
While creating the TarArchiveInputStream, we pass an inputStream as an argument, and this inputStream is, in fact, a handle to the file (as far as I can recollect, on *-nix OSes it is a file descriptor, which the kernel resolves to the file's actual inode structure).
It does store content in memory, as explained before, but not the entire file. How much data is read into memory at a time depends on the size of the byte[] passed when invoking the read(...) method on TarArchiveInputStream.
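As a rough illustration of how the byte[] size bounds each read, the sketch below pulls data through a GZIPInputStream wrapped around a BufferedInputStream in fixed-size chunks. To stay JDK-only it leaves out TarArchiveInputStream and uses an in-memory source instead of a file; since TarArchiveInputStream is also an InputStream, the same read loop would apply to it. ChunkedReadDemo, gzip, and readInChunks are hypothetical names for this example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class ChunkedReadDemo {
    /** Compresses raw bytes in memory so there is something to stream back. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Streams the data back in chunks; returns {totalBytes, largestSingleRead}. */
    static long[] readInChunks(byte[] gzipped, int chunkSize) {
        byte[] chunk = new byte[chunkSize];   // the caller's buffer caps every read
        long total = 0;
        long largest = 0;
        try (InputStream in = new GZIPInputStream(
                new BufferedInputStream(new ByteArrayInputStream(gzipped)))) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                total += n;
                largest = Math.max(largest, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { total, largest };
    }

    public static void main(String[] args) {
        long[] result = readInChunks(gzip(new byte[100_000]), 4096);
        System.out.println("total bytes: " + result[0] + ", largest single read: " + result[1]);
    }
}
```

No single read returns more than the chunk size, so the caller's memory use is governed by the buffer it passes in rather than by the size of the archive.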
Also, if it helps, this is the link that I used to see how to read entries using TarArchiveInputStream.