Writing append-only gzipped log files in Python

Posted 2019-04-28 17:27

Question:

I am building a service that writes plain-text logs from several sources (one file per source). I do not intend to rotate these logs, as they must be kept forever.

To make these permanent files smaller, I hope to gzip them on the fly. Since they are log data, the files compress very well.

What is a good approach in Python for writing append-only gzipped text files, so that writing can be resumed later when the service is stopped and restarted? I am not too worried about losing a few lines, but it is unacceptable for the gzip container itself to break and the file to become unreadable.

Also, if this is not feasible or not worth the hassle, I can simply write the logs as plain text without gzipping.

Answer 1:

Note: On Unix systems, you should seriously consider using an external program written for this exact task:

  • logrotate (rotates, compresses, and mails system logs)

You can set the number of rotations so high that the first file would not be deleted for 100 years or so.


In Python 2, logging.FileHandler takes a keyword argument, encoding, that can be set to bz2 or zlib.

This works because logging uses the codecs module, which in turn treats bz2 (or zlib) as an encoding:

>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2") as fh:
...     fh.write("Hello World\n")

$ bzcat on-the-fly-compressed.txt.bz2 
Hello World

Python 3 version (although the docs mention bz2 as an alias, you actually have to use bz2_codec, at least with 3.2.3):

>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2_codec") as fh:
...     fh.write(b"Hello World\n")

$ bzcat on-the-fly-compressed.txt.bz2 
Hello World
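
For the gzip case the question asks about, here is a minimal sketch using the standard gzip module in append mode. It is not part of the answer above. It relies on the gzip format allowing several compressed members to be concatenated in one file, and on gzip.open reading such a file back as a single stream. The helper names append_lines and read_lines are hypothetical, and text modes such as "at" assume Python 3.3 or later.

```python
import gzip


def append_lines(path, lines):
    """Append text lines to a gzip file (hypothetical helper name)."""
    # Mode "at" appends in text mode. Each open/close cycle adds one
    # complete gzip member to the end of the file, so members written
    # earlier are never rewritten.
    with gzip.open(path, "at", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line + "\n")


def read_lines(path):
    """Read every line back, across all members (hypothetical helper name)."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [line.rstrip("\n") for line in fh]
```

Calling append_lines again after a service restart resumes writing to the same file. If the process is killed in the middle of a write, the last member may be left incomplete. Earlier members should remain readable, but test this failure mode before relying on it. Opening the file once per log line compresses poorly, so batching several lines per call is likely the better trade-off.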


Tags: python gzip