I am using Python 2.7. I have a .bz2 file, and I need to figure out the uncompressed file size of its component file without actually decompressing it. I have found ways to do this for gzip and tar files. Anyone know of a way for bz2 files?
Thanks very much
It seems that determining the uncompressed size of a bz2 file without actually decompressing it is impossible. See this link for more details and a possible solution: https://superuser.com/questions/53984/is-there-a-way-to-determine-the-decompressed-size-of-a-bz2-file
As the other answers have stated, this is not possible without decompressing the data. However, if the size of the decompressed data is large, this can be done by decompressing it in chunks and adding the size of the chunks:
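A minimal sketch of the chunked approach, assuming the file is a regular .bz2 file on disk. The function name `bz2_uncompressed_size` and the 1 MiB default chunk size are illustrative choices, not anything from the original answer. It only uses `bz2.BZ2File`, so it should run on Python 2.7 as well as Python 3:

```python
import bz2


def bz2_uncompressed_size(path, chunk_size=1024 * 1024):
    """Return the decompressed size of a .bz2 file, in bytes.

    The data is still decompressed, but only one chunk is held in
    memory at a time, so large files do not exhaust RAM.
    """
    total = 0
    with bz2.BZ2File(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # decompressed bytes
            if not chunk:               # empty read means end of stream
                break
            total += len(chunk)
    return total
```

Memory use is bounded by `chunk_size`, but the running time is still roughly that of a full decompression.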
Alternatively (and probably faster, though I haven't profiled this) you can seek() to the end of the file and then use tell() to find out how long it is.

I suspect this is impossible due to the nature of the bz2 format and the compression techniques it uses. Here is a fairly good description of both the format and the algorithms: http://en.wikipedia.org/wiki/Bzip2#File_format
You cannot know the original data size until you decompress it.
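For reference, the seek()/tell() variant mentioned in the earlier answer could look roughly like the sketch below. The function name `bz2_size_via_seek` is made up for illustration. Note that `bz2.BZ2File` emulates seeking by decompressing the stream internally, so this is consistent with the point above: the data is still decompressed, just not returned to the caller.

```python
import bz2
import os


def bz2_size_via_seek(path):
    """Return the decompressed size of a .bz2 file using seek()/tell().

    Seeking to the end forces BZ2File to decompress the whole stream
    internally; tell() then reports the position in the decompressed
    data, which equals its total length.
    """
    with bz2.BZ2File(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        return f.tell()
```

Whether this is actually faster than reading in chunks would need to be measured on the files in question.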