In honor of the Hutter Prize, what are the top algorithms (and a quick description of each) for text compression?
Note: The intent of this question is to get a description of compression algorithms, not of compression programs.
The boundary-pushing compressors combine algorithms for insane results. Common algorithms include:
Maximum Compression is a pretty cool text and general compression benchmark site. Matt Mahoney publishes another benchmark. Mahoney's may be of particular interest because it lists the primary algorithm used per entry.
There's always lzip.
All kidding aside:
[...] (DEFLATE algorithm) still wins.
[...] (LZMA algorithm) compresses very well and is available under the LGPL. Few operating systems ship with built-in support, however.
If you want to use PAQ as a program, you can install the zpaq package on Debian-based systems. Usage is (see also man zpaq):
zpaq c archivename.zpaq file1 file2 file3
The resulting archive was roughly 1/8th the size of the equivalent zip file (1.9M vs 15M).
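If you just want a rough feel for how the DEFLATE and LZMA algorithms mentioned above compare on text, without installing anything, here is a minimal sketch using Python's standard zlib and lzma modules. The helper names (compressed_sizes, roundtrip_ok) and the sample text are made up for illustration, and the numbers you get will depend heavily on the input.

```python
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` raw and after each compressor."""
    return {
        "raw": len(data),
        # zlib wraps a DEFLATE stream; level 9 is its strongest setting.
        "deflate": len(zlib.compress(data, 9)),
        # lzma.compress writes an .xz container by default; preset 9 is the
        # strongest standard preset.
        "lzma": len(lzma.compress(data, preset=9)),
    }


def roundtrip_ok(data: bytes) -> bool:
    """Check that both compressors give back the original bytes."""
    return (
        zlib.decompress(zlib.compress(data, 9)) == data
        and lzma.decompress(lzma.compress(data, preset=9)) == data
    )


if __name__ == "__main__":
    # Hypothetical sample: artificially repetitive, so it compresses far
    # better than real prose would.
    sample = ("the quick brown fox jumps over the lazy dog. " * 2000).encode("utf-8")
    for name, size in compressed_sizes(sample).items():
        print(f"{name:8s} {size:8d} bytes")
```

For meaningful numbers, swap the repeated sentence for a real text file read in binary mode. Both formats add some fixed container overhead, so very small inputs can come out larger than they went in.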