Python: tarfile stream

Posted 2019-02-27 11:49

Question:

I would like to read some files from a tarball and save them to a new tarball. This is the code I wrote.

import tarfile

archive = 'dum/2164/archive.tar'

# Read input data.
input_tar = tarfile.open(archive, 'r|')
tarinfo = input_tar.next()
input_tar.close()

# Write output file.
output_tar = tarfile.open('foo.tar', 'w|')
output_tar.addfile(tarinfo)
output_tar.close()

Unfortunately, the output tarball is truncated:

$ tar tf foo.tar
./1QZP_A--2JED_A--not_reformatted.dat.bz2
tar: Truncated input file (needed 1548288 bytes, only 1545728 available)
tar: Error exit delayed from previous errors.

Any clue how to read and write tarballs on the fly with Python?

Answer 1:

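OK, so this is how I managed to do it. addfile() writes only the header unless it is also given a file object holding the member's data, so the data has to come from extractfile(), and the input tarball has to stay open until it has been written.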

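import tarfile

archive = 'dum/2164/archive.tar'

# Read the first member's header and get a file object for its data.
input_tar = tarfile.open(archive, 'r|')
tarinfo = input_tar.next()
fileobj = input_tar.extractfile(tarinfo)

# Write the header followed by tarinfo.size bytes read from fileobj.
output_tar = tarfile.open('foo.tar', 'w|')
output_tar.addfile(tarinfo, fileobj)

# Close the input only after its data has been copied.
input_tar.close()
output_tar.close()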
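
To copy more than one member, the same idea can be put in a loop. This is only a minimal sketch: copy_members and its keep filter are hypothetical names of my own, not part of tarfile, and it assumes that only regular files carry data that needs to be passed along.

```python
import tarfile


def copy_members(src_path, dst_path, keep=lambda member: True):
    """Stream selected members from src_path into a new tarball at dst_path.

    `keep` is a hypothetical filter callback: it receives each TarInfo
    and returns True for the members that should be copied.
    """
    copied = []
    with tarfile.open(src_path, 'r|') as src, tarfile.open(dst_path, 'w|') as dst:
        for member in src:
            if not keep(member):
                continue
            if member.isfile():
                # Regular file: pass the data along with the header. In stream
                # mode this has to happen before moving on to the next member.
                dst.addfile(member, src.extractfile(member))
            else:
                # Directories, symlinks and the like normally have no data.
                dst.addfile(member)
            copied.append(member.name)
    return copied
```

For example, copy_members('dum/2164/archive.tar', 'foo.tar', keep=lambda m: m.name.endswith('.bz2')) would keep only the .bz2 members. Both tarballs are opened in stream mode, so each member is handled in the order it appears in the input.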