XBin (de)compression.

LTjake on 2003-03-25T15:24:18

I've been working on Image::XBin. XBin is a text-mode file format similar to a raw BIN file, but it also allows for custom palettes and fonts.

XBin files can be stored raw (a character byte followed by an attribute byte), or slightly compressed (by grouping identical, sequential elements).
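For the raw case, a minimal sketch of turning the data into character/attribute pairs might look like this. The sample string is made up for illustration, and the bytes are unpacked to numbers first so that bitwise operators behave numerically later on.

```perl
use strict;
use warnings;

# Raw (uncompressed) image data: character byte, attribute byte, repeated.
my $raw   = "A\x07B\x1F";          # made-up sample, not from a real XBin file
my @bytes = unpack 'C*', $raw;     # numeric byte values 0..255

my @pairs;
while ( my ( $char, $attr ) = splice @bytes, 0, 2 ) {
    push @pairs, [ $char, $attr ];
}

printf "%d pairs\n", scalar @pairs;
```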

If the data has been compressed, then the first byte you get is a compression byte. It's laid out like so:

  +---+---+---+---+---+---+---+---+
  | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
  +---+---+---+---+---+---+---+---+
ex  1   0   0   0   0   1   0   0

Bits 6 & 7 indicate the type of compression (i.e. the repeated element):

  • 00 - No Compression
  • 01 - Character Compression
  • 10 - Attribute Compression
  • 11 - Character & Attribute Compression

The other 6 bits hold the repeat counter minus one: a stored counter of 0 is in reality a repeat of 1, so the maximum repeat count is 64. In the example above, 0 0 0 1 0 0 is 4, plus one is 5.
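As a small sketch of the bit layout described above, the compression byte can be split with two masks. decode_header() is a hypothetical helper name used only for illustration; it is not part of Image::XBin.

```perl
use strict;
use warnings;

# Split a compression byte into its type (bits 7-6) and run length (bits 5-0).
# decode_header() is a hypothetical name, not an Image::XBin function.
sub decode_header {
    my ($byte) = @_;
    my $type  = ( $byte & 0xC0 ) >> 6;    # 0 = none, 1 = char, 2 = attr, 3 = both
    my $count = ( $byte & 0x3F ) + 1;     # stored value is the count minus one
    return ( $type, $count );
}

my @names = ( 'no', 'character', 'attribute', 'character & attribute' );
my ( $type, $count ) = decode_header(0b10000100);    # the example byte above
printf "%s compression, repeat of %d\n", $names[$type], $count;
```

For the example byte this reports attribute compression with a repeat of 5.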

So, given an array of characters (@image) from an XBin file, this is what I came up with to decompress it (push character, attribute pairs onto a new array ($image)):

my $image = [];
my $x = -1;
while ( ++$x < @image ) {
   my $char = $image[$x];
   my $attr;

   # get compression bits
   my $type = $char & 192;

   # get counter bits
   my $counter = $char & 63;

   # no compression
   if ( $type == 0 ) {
       push @$image, [$image[++$x], $image[++$x]] for (0..$counter);
   }
   # character compression
   elsif ( $type == 64 ) {
       $char = $image[++$x];
       push @$image, [$char, $image[++$x]] for (0..$counter);
   }
   # attribute compression
   elsif ( $type == 128 ) {
       $attr = $image[++$x];
       push @$image, [$image[++$x], $attr] for (0..$counter);
   }
   # character & attribute compression
   else {
       $char = $image[++$x];
       $attr = $image[++$x];
       push @$image, [$char, $attr] for (0..$counter);
   }
}

This works great -- except I always end up with a few characters more than width * height for the uncompressed data. Perhaps there's a better way to do this algorithm?
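One possible explanation, offered as a guess rather than a diagnosis: the loop runs until the input is exhausted, so any bytes that follow the image data (a SAUCE record, if the file has one) would be decoded as extra runs. A sketch of a variant that stops once width * height pairs have been produced is below. decompress() and its arguments are hypothetical names, and the input is assumed to be a list of numeric byte values that starts at the image data.

```perl
use strict;
use warnings;

# Sketch: decode runs until $width * $height pairs exist, ignoring whatever
# follows. decompress() is a hypothetical name, not part of Image::XBin.
sub decompress {
    my ( $bytes, $width, $height ) = @_;
    my $wanted = $width * $height;
    my @pairs;
    my $i = 0;

    while ( @pairs < $wanted && $i < @$bytes ) {
        my $header = $bytes->[ $i++ ];
        my $type   = $header & 0xC0;
        my $count  = ( $header & 0x3F ) + 1;

        # A repeated element is stored once, right after the header byte.
        my ( $char, $attr );
        if    ( $type == 0x40 ) { $char = $bytes->[ $i++ ] }
        elsif ( $type == 0x80 ) { $attr = $bytes->[ $i++ ] }
        elsif ( $type == 0xC0 ) { $char = $bytes->[ $i++ ]; $attr = $bytes->[ $i++ ] }

        for ( 1 .. $count ) {
            # Anything not repeated is read from the stream for each pair.
            my $ch = defined $char ? $char : $bytes->[ $i++ ];
            my $at = defined $attr ? $attr : $bytes->[ $i++ ];
            push @pairs, [ $ch, $at ];
        }
    }
    return \@pairs;
}

# 2x2 image: one attribute-compressed run of 4, then two trailing bytes.
my @data  = ( 0x83, 0x07, 65, 66, 67, 68, 0x1A, 0x53 );
my $pairs = decompress( \@data, 2, 2 );
print scalar(@$pairs), " pairs\n";
```

With the made-up data above, the two trailing bytes are never looked at, so the result holds exactly four pairs.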


FYI

belg4mit on 2003-03-25T21:33:35

by grouping identical, sequential elements

That's commonly referred to as Run Length Encoding (RLE).