The impending death of BZip2

Alias on 2009-07-02T06:18:57

adam@svn:~/svn.ali.as/db$ ls -l
total 30884
-rw-r--r-- 1 adam adam 9558294 Jul  2 03:45 cpandb.gz
-rw-r--r-- 1 adam adam 8538979 Jul  2 03:45 cpandb.bz2
-rw-r--r-- 1 adam adam 5960155 Jul  2 03:45 cpandb.lz
-rw-r--r-- 1 adam adam 3014480 Jun 30 06:46 cpanmeta.gz
-rw-r--r-- 1 adam adam 2658756 Jun 30 06:46 cpanmeta.bz2
-rw-r--r-- 1 adam adam 1825600 Jun 30 06:46 cpanmeta.lz
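
To put numbers on the listing above, here is a small sketch that works out how much smaller each file is than its gzip and bzip2 counterparts. The byte counts are copied from the `ls -l` output; the `saving` helper and the use of Python are my own choices for illustration, not anything from the original post.

```python
# Byte counts copied from the ls -l listing above.
sizes = {
    "cpandb": {"gz": 9558294, "bz2": 8538979, "lz": 5960155},
    "cpanmeta": {"gz": 3014480, "bz2": 2658756, "lz": 1825600},
}


def saving(baseline: int, candidate: int) -> float:
    """Return the percentage by which candidate is smaller than baseline."""
    return 100.0 * (1 - candidate / baseline)


for name, s in sizes.items():
    print(
        f"{name}: "
        f"bz2 vs gz {saving(s['gz'], s['bz2']):.1f}%, "
        f"lz vs gz {saving(s['gz'], s['lz']):.1f}%, "
        f"lz vs bz2 {saving(s['bz2'], s['lz']):.1f}%"
    )
```

On these two files bzip2 comes out roughly 11-12% smaller than gzip, while lz comes out roughly 38-39% smaller than gzip.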


In my book, it's been dead for years

Mr. Muskrat on 2009-07-02T11:36:07

BZip2 has always been too slow for my usage. Most of the time I simply rely on gzip even though it cannot compress as much. Has lz really come that far, and if so, how is the speed?

lrzip is even better

Ed Avis on 2009-07-02T11:54:00

I like lrzip, which does a long-range redundancy-removal pass (in the style of rzip) before LZMA compression, usually giving an even better space/speed tradeoff.

XZ is the other plain LZMA-based compressor; it's not clear why both it and lzip need to exist, but hopefully one of them will evolve to support both file formats and thus become the winner.

gzip

bart on 2009-07-02T14:21:05

So, by the same measure... why exactly is gzip still around?

Re:gzip

Alias on 2009-07-02T17:10:57

gzip is fast, easy to implement, and uses almost no memory.

It's easily streamable and you get great bang for your buck so you can do it easily on the fly.
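
As a rough illustration of what "on the fly" means here, the sketch below compresses a sequence of chunks into a gzip stream without ever holding the whole input in memory. It uses Python's standard `zlib` module; the `gzip_stream` name and the sample chunks are hypothetical, picked only for this example.

```python
import gzip
import zlib


def gzip_stream(chunks, level=6):
    """Yield gzip-format output incrementally for an iterable of byte chunks."""
    # wbits=31 (16 + 15) asks zlib for a gzip header and trailer
    # with the usual 32 KiB window.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # zlib may buffer input and emit nothing for a while
            yield out
    # Flush whatever is still buffered and write the gzip trailer.
    yield compressor.flush()


if __name__ == "__main__":
    parts = [b"line %d of some log\n" % i for i in range(1000)]
    packed = b"".join(gzip_stream(parts))
    # Round-trip check: any gzip reader should accept the stream.
    assert gzip.decompress(packed) == b"".join(parts)
    print(len(b"".join(parts)), "->", len(packed), "bytes")
```

The compressor only keeps its window and a small buffer, which is why this kind of thing is cheap enough to do per request or per connection.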

bzip2 is heavier, a lot slower, and only buys a further 10-20% reduction in size.

lzma is asymmetrical. It's a LOT more expensive on the compression side, and both sides use a lot more memory. But the decompression code is very small and FASTER than bzip2.

So as long as you have the memory (desktop, server), lzma output is much smaller than bzip2's, and it decompresses faster as well. Which means for packaged software and other compress-once, decompress-many files, lzma is much better.

But for resource-constrained situations, times you need a lot of speed, or anything with on-the-fly compression, gzip is still the lowest common denominator and the best choice.
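
If you want to check the tradeoffs above on your own data, something like the following sketch would do. It times compression and decompression for gzip, bzip2 and LZMA using Python's standard library. Two assumptions to be aware of: `lzma.compress` writes the .xz container by default rather than lzip's .lz format, and the `compare` helper and the synthetic sample data are made up for this example. Timings will vary a lot by machine and input, so treat the output as indicative only.

```python
import bz2
import gzip
import lzma
import time


def compare(data: bytes) -> dict:
    """Compress and decompress data with each codec, recording size and timings."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),  # .xz container by default
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:
            raise ValueError(f"{name} round trip failed")
        results[name] = {
            "size": len(packed),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


if __name__ == "__main__":
    # Synthetic, index-like sample text; substitute your own file contents.
    sample = b"".join(
        b"Some::Module-%d 0.%02d AUTHOR%d\n" % (i, i % 100, i % 500)
        for i in range(20000)
    )
    for name, r in compare(sample).items():
        print(
            f"{name:6s} {r['size']:8d} bytes  "
            f"compress {r['compress_s']:.3f}s  "
            f"decompress {r['decompress_s']:.3f}s"
        )
```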