Question:
Is there a way to write a string directly to a tarfile? From http://docs.python.org/library/tarfile.html it looks like only files already written to the file system can be added.
Answer 1:
I would say it's possible, by using TarInfo and TarFile.addfile and passing a StringIO as the file object.
Very rough, but it works:
import tarfile
import StringIO
tar = tarfile.TarFile("test.tar","w")
string = StringIO.StringIO()
string.write("hello")
string.seek(0)
info = tarfile.TarInfo(name="foo")
info.size=len(string.buf)
tar.addfile(tarinfo=info, fileobj=string)
tar.close()
Answer 2:
As Stefano pointed out, you can use TarFile.addfile and StringIO.
import tarfile, StringIO
data = 'hello, world!'
tarinfo = tarfile.TarInfo('test.txt')
tarinfo.size = len(data)
tar = tarfile.open('test.tar', 'a')
tar.addfile(tarinfo, StringIO.StringIO(data))
tar.close()
You'll probably want to fill other fields of tarinfo (e.g. mtime, uname etc.) as well.
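As a rough illustration of filling those extra fields, here is a minimal Python 3 sketch. The helper name make_member and the owner value "nobody" are made up for this example, and the archive is written to an in-memory buffer only to keep the sketch self-contained.

```python
import io
import tarfile
import time

def make_member(name, text, owner="nobody"):
    """Build a (TarInfo, file object) pair for an in-memory text payload.

    Hypothetical helper; the owner default is an illustrative value.
    """
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name=name)
    info.size = len(data)            # size in bytes, needed by addfile()
    info.mtime = int(time.time())    # TarInfo defaults mtime to 0 (1970-01-01)
    info.mode = 0o644                # permission bits for a regular file
    info.uname = owner               # symbolic owner name
    info.gname = owner               # symbolic group name
    return info, io.BytesIO(data)

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    tar.addfile(*make_member("test.txt", "hello, world!"))

# Read the archive back to check that the metadata was stored.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmember("test.txt")
    content = tar.extractfile(member).read()
```

Without an explicit mtime, tools such as tar -tv will list the member with a 1970 timestamp.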
Answer 3:
I found this while looking for a way to serve a freshly created, in-memory .tgz archive from Django. Maybe somebody else will find my code useful:
import tarfile
from io import BytesIO
from django.http import HttpResponse

def serve_file(request):
    # Build the gzip-compressed archive in memory
    out = BytesIO()
    tar = tarfile.open(mode="w:gz", fileobj=out)
    data = 'lala'.encode('utf-8')
    file = BytesIO(data)
    info = tarfile.TarInfo(name="1.txt")
    info.size = len(data)
    tar.addfile(tarinfo=info, fileobj=file)
    tar.close()  # must be closed before reading out, or the archive is incomplete
    response = HttpResponse(out.getvalue(), content_type='application/tgz')
    response['Content-Disposition'] = 'attachment; filename=myfile.tgz'
    return response
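The archive-building part can be tried without Django. The following is a minimal sketch under that assumption; build_tgz is a hypothetical helper name, and the second file entry is invented for the demonstration.

```python
import io
import tarfile

def build_tgz(files):
    """Return the bytes of a gzip-compressed tar holding {name: text} entries."""
    out = io.BytesIO()
    with tarfile.open(mode="w:gz", fileobj=out) as tar:
        for name, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    # The archive is complete only once the with-block has closed it.
    return out.getvalue()

payload = build_tgz({"1.txt": "lala", "2.txt": "more text"})

# Read the archive back from memory to confirm it is well formed.
with tarfile.open(mode="r:gz", fileobj=io.BytesIO(payload)) as tar:
    names = tar.getnames()
    first = tar.extractfile("1.txt").read()
```

The returned bytes are what the view above passes to HttpResponse.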
Answer 4:
Just for the record: StringIO objects have a .len property, so there is no need to seek(0) and take len(foo.buf) just to get the size.
There is also no need to keep the entire string around to call len() on or, God forbid, to do the accounting yourself.
(Maybe the property did not exist when the original post was written.)
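Note that .len belongs to Python 2's StringIO.StringIO; io.BytesIO in Python 3 has no such attribute. A rough Python 3 equivalent, offered as a sketch rather than the only way to do it, is to ask the buffer for its byte count:

```python
import io
import tarfile

payload = io.BytesIO()
payload.write(b"hello")
payload.write(b", world!")

info = tarfile.TarInfo(name="foo")
# Total size of the buffer in bytes, independent of the current position.
info.size = payload.getbuffer().nbytes
payload.seek(0)  # still needed: addfile() reads from the current position

archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    tar.addfile(info, payload)

# Read the member back to check that nothing was truncated.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    stored = tar.extractfile("foo").read()
```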
Answer 5:
You have to use TarInfo objects and the addfile method instead of the usual add method:
from StringIO import StringIO
from tarfile import open, TarInfo
s = "Hello World!"
ti = TarInfo("test.txt")
ti.size = len(s)
tf = open("testtar.tar", "w")
tf.addfile(ti, StringIO(s))
tf.close()  # finalize the archive; without this the file may be incomplete
Answer 6:
In my case I wanted to read from an existing tar file, append some data to the contents, and write it to a new file. Something like:
for ti in tar_in:
    buf_in = tar_in.extractfile(ti)  # returns None for directories
    buf_out = io.BytesIO()
    size = buf_out.write(buf_in.read())
    size += buf_out.write(other_data)  # other_data: the bytes to append
    buf_out.seek(0)
    ti.size = size
    tar_out.addfile(ti, fileobj=buf_out)
Extra code is needed for handling directories and links.
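One possible shape for that extra handling, as a sketch only: copy the header of every member that is not a regular file unchanged, and rewrite the data of regular files. The function name copy_with_suffix and the sample archive contents are invented for this example.

```python
import io
import tarfile

def copy_with_suffix(tar_in, tar_out, suffix):
    """Copy every member of tar_in to tar_out, appending suffix to regular files."""
    for ti in tar_in:
        if not ti.isreg():
            # Directories and links carry no data of their own here;
            # copy the header unchanged.
            tar_out.addfile(ti)
            continue
        data = tar_in.extractfile(ti).read() + suffix
        ti.size = len(data)
        tar_out.addfile(ti, fileobj=io.BytesIO(data))

# Build a small source archive in memory: one directory, one file, one symlink.
src = io.BytesIO()
with tarfile.open(fileobj=src, mode="w") as tar:
    d = tarfile.TarInfo("docs")
    d.type = tarfile.DIRTYPE
    tar.addfile(d)
    f = tarfile.TarInfo("docs/a.txt")
    f.size = 5
    tar.addfile(f, io.BytesIO(b"hello"))
    link = tarfile.TarInfo("docs/latest")
    link.type = tarfile.SYMTYPE
    link.linkname = "a.txt"
    tar.addfile(link)

src.seek(0)
dst = io.BytesIO()
with tarfile.open(fileobj=src, mode="r") as tar_in, \
        tarfile.open(fileobj=dst, mode="w") as tar_out:
    copy_with_suffix(tar_in, tar_out, b" world")

# Inspect the result.
dst.seek(0)
with tarfile.open(fileobj=dst, mode="r") as tar:
    result = tar.extractfile("docs/a.txt").read()
    kinds = [(m.name, m.type) for m in tar]
```

Hard links to a rewritten file are copied as links here, so they keep pointing at the modified member; whether that is the desired behaviour depends on the use case.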
Answer 7:
The solution in Python 3 uses io.BytesIO. Be sure to set TarInfo.size to the length of the bytes, not the length of the string.
Given a single string, the simplest solution is to call .encode() on it to obtain bytes. In this day and age you probably want UTF-8, but if the recipient is expecting a specific encoding, such as ASCII (i.e. no multi-byte characters), then use that instead.
import io
import tarfile
data = 'hello\n'.encode('utf8')
info = tarfile.TarInfo(name='foo.txt')
info.size = len(data)
with tarfile.TarFile('test.tar', 'w') as tar:
    tar.addfile(info, io.BytesIO(data))
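To see why the byte length matters, consider text with non-ASCII characters. The sample string below is made up for this sketch; addfile copies exactly TarInfo.size bytes from the file object, so using the character count here would cut the member short.

```python
import io
import tarfile

text = "na\u00efve caf\u00e9\n"   # two characters outside ASCII
data = text.encode("utf-8")

chars = len(text)   # 11 code points
size = len(data)    # 13 bytes, which is what TarInfo.size must hold

info = tarfile.TarInfo(name="foo.txt")
info.size = size

archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    tar.addfile(info, io.BytesIO(data))

# Read the member back and decode it to confirm nothing was lost.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    roundtrip = tar.extractfile("foo.txt").read().decode("utf-8")
```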
If you really need a writable string buffer, similar to the accepted answer by @Stefano Borini for Python 2, then the solution is to use io.TextIOWrapper over an underlying io.BytesIO buffer.
import io
import tarfile
textIO = io.TextIOWrapper(io.BytesIO(), encoding='utf8')
textIO.write('hello\n')
bytesIO = textIO.detach()
info = tarfile.TarInfo(name='foo.txt')
info.size = bytesIO.tell()
with tarfile.TarFile('test.tar', 'w') as tar:
    bytesIO.seek(0)
    tar.addfile(info, bytesIO)