I am building a service where I log plain text format logs from several sources (one file per source). I do not intend to rotate these logs as they must be around forever.
To make these forever around files smaller I hope I could gzip them in fly. As they are log data, the files compress very well.
What is a good approach in Python to write append-only gzipped text files, so that the writing can be later resumed when service goes on and off? I am not that worried about losing few lines, but if gzip container itself breaks down and the file becomes unreadable that's no no.
Also, if it's no go, I can simply write them in as plain text without gzipping if it's not worth of the hassle.
Note: On unix systems you should seriously consider using an external program, written for this exact task:
logrotate
(rotates, compresses, and mails system logs)
You can set the number of rotations so high, that the first file would be deleted in 100 years or so.
In Python 2, logging.FileHandler
takes an keyword argument encoding
that can be set to bz2
or zlib
.
This is because logging
uses the codecs
module, which in turn treats bz2
(or zlib
) as encoding:
>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2") as fh:
... fh.write("Hello World\n")
$ bzcat on-the-fly-compressed.txt.bz2
Hello World
Python 3 version (although the docs mention bz2
as alias, you'll actually have to use bz2_codec
- at least w/ 3.2.3):
>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2_codec") as fh:
... fh.write(b"Hello World\n")
$ bzcat on-the-fly-compressed.txt.bz2
Hello World