I'm attempting to use Python's tarfile module to extract a tar.gz archive.
I'd like the extraction to overwrite any target files if they already exist - this is tarfile's normal behaviour.
However, I'm hitting a snag in that some of the files have write-protection on (e.g. chmod 550).
The tarfile.extractall() operation actually fails:
IOError: [Errno 13] Permission denied '/foo/bar/file'
If I try to delete the files from the command line, I can do it; I just need to answer a prompt:
$ rm <filename>
rm: <filename>: override protection 550 (yes/no)? yes
The normal GNU tar utility also handles these files effortlessly - it just overwrites them when you extract.
My user is the owner of the files, so it wouldn't be hard to recursively chmod the target files before running tarfile.extractall. Or I can use shutil.rmtree to blow away the target beforehand, which is the workaround I'm using now. However, that feels a little hackish.
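For reference, the recursive chmod workaround could look roughly like the sketch below. The helper name make_tree_writable is made up for illustration, and it assumes the current user owns everything under the target directory.

```python
import os
import stat


def make_tree_writable(root):
    """Add the owner-write bit to everything below root (hypothetical helper)."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Leave symlinks alone; chmod would follow them to their target.
                continue
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            os.chmod(path, mode | stat.S_IWUSR)
```

Calling make_tree_writable on the extraction target before tarfile.extractall() should let the overwrite go through, at the cost of touching every file whether or not the archive contains it.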
Is there a more Pythonic way to handle overwriting read-only files within tarfile, using exceptions, or something similar?
I was able to get Mike Steder's code to work like this:
You could loop over the members of the tarball and extract / handle errors on each file:
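A minimal sketch of that per-member loop, assuming Python 3 (where the permission failure surfaces as PermissionError rather than the IOError shown in the question). The function name extract_overwriting and the dest parameter are my own choices for illustration, not part of tarfile.

```python
import os
import tarfile


def extract_overwriting(tar, dest="."):
    """Extract each member of an open TarFile, replacing read-only files."""
    for member in tar.getmembers():
        try:
            tar.extract(member, path=dest)
        except PermissionError:
            if not member.isfile():
                # Only regular files are retried; anything else is re-raised.
                raise
            # The existing target is write-protected: remove it and retry.
            os.remove(os.path.join(dest, member.name))
            tar.extract(member, path=dest)
```

Removing the old file only needs write permission on its parent directory, which is the same reason rm succeeds after the prompt. The retry relies on the error being raised, which is the case with tarfile's default errorlevel of 1.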
In a modern version of Python I'd use the with statement:
If you can't use with, just replace the with statement block with:
If your tarball is gzipped, there's a quick shortcut to handle that with just:
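The snippets that originally followed these sentences are not shown here, so the following is only a sketch of what the three variants might look like. The function names are hypothetical; tarfile.open, the "r:gz" mode, and context-manager support on TarFile are standard library features.

```python
import tarfile


def member_names(archive_path):
    # "r:gz" opens a gzip-compressed tarball; plain "r" would let
    # tarfile detect the compression on its own.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


def member_names_without_with(archive_path):
    # Same thing without the with statement: close the archive explicitly.
    tar = tarfile.open(archive_path, "r:gz")
    try:
        return tar.getnames()
    finally:
        tar.close()
```

Either way of opening the archive gives you a TarFile object you can loop over member by member, instead of calling extractall() in one go.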
It would be nicer if tarfile.extractall had an overwrite option.