(un)?pack is magic to me

triv on 2002-05-29T05:22:35

I was hacking on Net::DNS today (I should get back to hacking on Dyn!), and I wrote this:

if ($string =~ m/^
        ([a-zA-Z0-9]{2})  # AFI  ($1)
        ([a-zA-Z0-9]{4})  # IDI  ($2)
        ([a-zA-Z0-9]{2})  # DFI  ($3)
        ([a-zA-Z0-9]{6})  # AA   ($4)
        ([a-zA-Z0-9]{4})  # Rsvd ($5)
        ([a-zA-Z0-9]{4})  # RD   ($6)
        ([a-zA-Z0-9]{4})  # Area ($7)
        ([a-zA-Z0-9]{12}) # ID   ($8)
        ([a-zA-Z0-9]{2})  # Sel  ($9)
    /x)
{
    $self->{'afi'}  = $1;
    $self->{'idi'}  = $2;
    $self->{'dfi'}  = $3;
    $self->{'aa'}   = $4;
    $self->{'rsvd'} = $5;
    $self->{'rd'}   = $6;
    $self->{'area'} = $7;
    $self->{'id'}   = $8;
    $self->{'sel'}  = $9;
}

Something deep down tells me I should be using unpack(), but I know and love regexen. Perhaps tomorrow I will get to know pack/unpack better.


Example

djberg96 on 2002-05-29T12:45:08

Unpack would definitely be less typing, and I believe it's considerably faster. However, unpack doesn't check for a valid string format. You could have extraneous characters (I'm guessing) at the end of the string and unpack would simply ignore them, which may or may not be the behavior you want.

Also, why [a-zA-Z0-9] instead of just \w?

#!/usr/bin/perl -w
use strict;

# A test of unpacking a Net::DNS string
# Format 2,4,2,6,4,4,4,12,2 letters/numbers

my @keys = qw/afi idi dfi aa rsvd rd area id sel/;

my $string = "xx2222yy777777zzzzaaaabbbb123456789012bb";

my @result = unpack("A2A4A2A6A4A4A4A12A2",$string);

my $self = {};

my $n = 0;
foreach(@keys){
   $self->{$_} = $result[$n];
   $n++;
}

while(my($key,$val) = each(%$self)){
   print "$key: $val\n";
}
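
The guess about trailing characters is easy to check. The sketch below reuses the template and test string from the example; the `$long` string and its junk suffix are made up for illustration. It suggests that unpack returns the same nine fields whether or not extra characters follow, so a separate length-and-content check is needed if malformed input should be rejected.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Template from the example above: nine fixed-width fields, 40 characters total.
my $template = "A2A4A2A6A4A4A4A12A2";

my $good = "xx2222yy777777zzzzaaaabbbb123456789012bb";
my $long = $good . "!!trailing junk";   # hypothetical malformed input

my @from_good = unpack($template, $good);
my @from_long = unpack($template, $long);

# unpack stops once the template is exhausted; the suffix is never examined.
print "fields from long string: ", scalar(@from_long), "\n";
print "same fields as the clean string\n" if "@from_good" eq "@from_long";

# An explicit check is what actually rejects the malformed string.
my $long_is_valid = ($long =~ /^[a-zA-Z0-9]{40}$/) ? 1 : 0;
print "long string passes validation: $long_is_valid\n";
```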

Re:Example

vsergu on 2002-05-29T20:46:08

"Also, why [a-zA-Z0-9] instead of just \w?"

Maybe he's worried about underscores. Maybe he's even using locale.
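
To make the underscore point concrete, here is a small sketch with a made-up four-character string. It only shows the underscore case; under `use locale`, \w may additionally match letters defined by the current locale, which the explicit character class never does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input containing an underscore.
my $candidate = "ab_d";

# \w is letters, digits, and underscore, so it accepts the string.
my $matches_w = ($candidate =~ /^\w{4}$/) ? 1 : 0;

# The explicit class has no underscore, so it rejects the string.
my $matches_class = ($candidate =~ /^[a-zA-Z0-9]{4}$/) ? 1 : 0;

print "\\w accepts '$candidate': $matches_w\n";
print "[a-zA-Z0-9] accepts '$candidate': $matches_class\n";
```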

Try a hash slice too:

Matts on 2002-05-29T13:01:38

if ($string =~ /^[a-zA-Z0-9]{40}$/) {
    @{ $self }{ qw(afi idi dfi aa rsvd rd area id sel) } =
         unpack("A2A4A2A6A4A4A4A12A2", $string);
}
Hope that helps!

Re:Try a hash slice too:

djberg96 on 2002-05-29T14:52:27

Nice. Vek, I compared your original regex with Matts' unpack + hash slice. Here are the results of a benchmark I ran:

Benchmark: timing 1000000 iterations of regex, unpak...

regex: 40 wallclock secs (38.95 usr + 0.00 sys = 38.95 CPU) @ 25673.94/s (n=1000000)
unpak: 2 wallclock secs ( 2.61 usr + 0.00 sys = 2.61 CPU) @ 383141.76/s (n=1000000)

I hit a 'lameness filter', or I would have posted the code as well, but you should be able to replicate the results with little effort. BTW, this was on a Sun Blade 100 (UltraSPARC IIe) with 128 MB of RAM.
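
For anyone who wants to try replicating this, here is one way the comparison might be set up with the core Benchmark module. This is not the code that produced the numbers above; the sub names `by_regex` and `by_unpack` are made up, and the regex version uses a hash slice instead of nine separate assignments to keep it short. Timings will differ by machine and Perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $string = "xx2222yy777777zzzzaaaabbbb123456789012bb";
my @keys   = qw(afi idi dfi aa rsvd rd area id sel);

# Approach 1: the nine-capture regex from the original post.
sub by_regex {
    my ($str) = @_;
    my %self;
    if ($str =~ m/^
            ([a-zA-Z0-9]{2})  ([a-zA-Z0-9]{4})  ([a-zA-Z0-9]{2})
            ([a-zA-Z0-9]{6})  ([a-zA-Z0-9]{4})  ([a-zA-Z0-9]{4})
            ([a-zA-Z0-9]{4})  ([a-zA-Z0-9]{12}) ([a-zA-Z0-9]{2})
        /x)
    {
        @self{@keys} = ($1, $2, $3, $4, $5, $6, $7, $8, $9);
    }
    return \%self;
}

# Approach 2: one validation regex, then unpack into a hash slice.
sub by_unpack {
    my ($str) = @_;
    my %self;
    if ($str =~ /^[a-zA-Z0-9]{40}$/) {
        @self{@keys} = unpack("A2A4A2A6A4A4A4A12A2", $str);
    }
    return \%self;
}

# Run each approach for at least one CPU second and print a comparison table.
cmpthese(-1, {
    regex => sub { by_regex($string) },
    unpak => sub { by_unpack($string) },
});
```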

Re:Try a hash slice too:

djberg96 on 2002-05-29T14:59:59

Oops, I meant triv, not Vek. That's what I get for reading so many posts at once. Sorry 'bout that.

Re:Try a hash slice too:

triv on 2002-05-29T19:01:59

You are my new hero.