XBin (de)compression.

LTjake on 2003-03-25T15:24:18

I've been working on Image::XBin. XBin is a textmode file format, similar to a raw BIN file, but allows for custom palettes and fonts.

XBin files can be stored raw (a character byte followed by an attribute byte), or slightly compressed (by grouping identical, sequential elements).

If the data has been compressed, then the first byte you get is a compression byte. It's laid out like so:

         +---+---+---+---+---+---+---+---+
         | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
         +---+---+---+---+---+---+---+---+
    ex     1   0   0   0   0   1   0   0

Bits 6 & 7 indicate the type of compression (i.e. the repeated element):

00 - No Compression
01 - Character Compression
10 - Attribute Compression
11 - Character & Attribute Compression

The other 6 bits hold the repeat counter minus one, so a stored value of 0 is in reality a repeat of 1. Thus, we can have a maximum repeat counter of 64. In the example above: 0 0 0 1 0 0 is 4, plus one is 5.
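As a quick illustration of the layout described above, here is a minimal sketch that splits a compression byte into its two fields. The sub name decode_compression_byte is hypothetical and not part of Image::XBin; it assumes the byte is already a numeric value (e.g. from ord or unpack).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Image::XBin): split one compression
# byte into its type (bits 7-6) and its repeat count (bits 5-0, plus one).
sub decode_compression_byte {
    my ($byte) = @_;
    my $type  = ( $byte & 0xC0 ) >> 6;    # 0..3, same order as the type list
    my $count = ( $byte & 0x3F ) + 1;     # stored value is the count minus one
    return ( $type, $count );
}

# The example byte from the diagram: 1 0 0 0 0 1 0 0
my ( $type, $count ) = decode_compression_byte(0b10000100);
printf "type=%02b count=%d\n", $type, $count;    # type=10 count=5
```

So the example byte means attribute compression with a run of 5.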

So, given an array of byte values (@image) from an XBin file, this is what I came up with to decompress it, pushing [character, attribute] pairs onto a new array reference ($image):

    my $image = [];
    my $x     = -1;
    while ( ++$x < @image ) {
        my $char = $image[$x];
        my $attr;

        # get compression bits
        my $type = $char & 192;

        # get counter bits
        my $counter = $char & 63;

        # no compression
        if ( $type == 0 ) {
            push @$image, [$image[++$x], $image[++$x]] for (0..$counter);
        }
        # character compression
        elsif ( $type == 64 ) {
            $char = $image[++$x];
            push @$image, [$char, $image[++$x]] for (0..$counter);
        }
        # attribute compression
        elsif ( $type == 128 ) {
            $attr = $image[++$x];
            push @$image, [$image[++$x], $attr] for (0..$counter);
        }
        # character & attribute compression
        else {
            $char = $image[++$x];
            $attr = $image[++$x];
            push @$image, [$char, $attr] for (0..$counter);
        }
    }

This works great -- except I always end up with a few characters more than width * height for the uncompressed data. Perhaps there's a better way to do this algorithm?
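One possible explanation, untested against the files in question: the loop runs until the input array is exhausted, so any bytes that follow the image data (a trailing SAUCE record, for example) would be decoded as additional runs. A minimal sketch that stops once width * height cells have been produced follows. The sub name xbin_decompress and its parameters are hypothetical, and the input is assumed to be a reference to an array of numeric byte values.

```perl
use strict;
use warnings;

# Hypothetical helper: decode XBin-compressed byte values into
# [character, attribute] pairs, stopping after $width * $height cells.
sub xbin_decompress {
    my ( $bytes, $width, $height ) = @_;
    my $cells = $width * $height;
    my @out;
    my $x = 0;

    while ( @out < $cells && $x < @$bytes ) {
        my $byte  = $bytes->[ $x++ ];
        my $type  = $byte & 0xC0;            # bits 7-6: compression type
        my $count = ( $byte & 0x3F ) + 1;    # bits 5-0: repeat count minus one

        # Bit 6 set: the character is shared by the whole run.
        # Bit 7 set: the attribute is shared by the whole run.
        my $char = ( $type & 0x40 ) ? $bytes->[ $x++ ] : undef;
        my $attr = ( $type & 0x80 ) ? $bytes->[ $x++ ] : undef;

        for ( 1 .. $count ) {
            last if @out >= $cells;
            my $ch = ( $type & 0x40 ) ? $char : $bytes->[ $x++ ];
            my $at = ( $type & 0x80 ) ? $attr : $bytes->[ $x++ ];
            push @out, [ $ch, $at ];
        }
    }
    return \@out;
}

# A 3x2 image: one run of 5 cells, one literal cell, then stray bytes.
my @data = (
    0xC4, 0x41, 0x07,    # character & attribute compression, 5 repeats
    0x00, 0x42, 0x0F,    # no compression, 1 literal pair
    0x53, 0x41,          # trailing bytes that are not image data
);
my $decoded = xbin_decompress( \@data, 3, 2 );
print scalar(@$decoded), "\n";    # 6
```

Without the cell limit, the trailing 0x53 would be read as a character-compression byte and produce extra pairs, which would match the symptom described.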

FYI

belg4mit on 2003-03-25T21:33:35

by grouping identical, sequential elements

That's commonly referred to as Run Length Encoding (RLE).