NHL datamining

ethan on 2004-05-06T06:36:16

I am extremely pleased with myself. I have overcome those obstacles that prevented me from reading and manipulating my NHL93 leagues. Now I can create nice tables such as this one:

    ethan@ethan:~/Projects/Games-NHL93/scripts$ perl team.pl PHI
                       Philadelphia Flyers
    Pos  Num  Name                  GP  G   A   PT  PIM   +/-
    ------------------------------------------------------------
    C    88   Eric Lindros          23  12  14  26  11      4
    R    8    Mark Recchi           23  12  8   20  16      2
    R    11   Kevin Dineen          23  11  9   20  21      1
    C    17   Rod Brind'Amour       23  10  10  20  12     -1
    R    27   Jim Cummins           20  10  10  20  4      17
    C    22   Viatcheslav Butsayev  23  3   9   12  2      -8
    D    34   Greg Hawgood          23  1   9   10  18     -7
    D    29   Terry Carkner         23  1   8   9   14      5
    L    18   Brent Fedyk           23  3   6   9   5       1
    D    2    Dimitri Yushkevich    23  1   8   9   8      -8
    L    9    Pelle Eklund          23  5   3   8   4      -6
    D    4    Mikhail Tatarinov     20  0   8   8   10     18
    R    14   Dave Snuggerud        23  4   2   6   0      -7
    C    25   Keith Acton           23  2   3   5   2       1
    L    15   Doug Evans            23  3   2   5   2       2
    D    41   Ryan McGill           23  2   3   5   13     11
    D    3    Garry Galley          23  1   3   4   10      0
    R    21   Dave Brown            23  1   2   3   0       2
    C    12   Neil Brady            2   0   1   1   0       0
    D    44   Shawn Cronin          0   0   0   0   0       0
    L    10   Claude Boivin         0   0   0   0   0       0
    L    23   Andrei Lomakin        0   0   0   0   0       0
    R    46   Allan Conroy          0   0   0   0   0       0
    D    26   Gord Hynes            0   0   0   0   0       0
    ------------------------------------------------------------
    Pos  Num  Name                  GP  MIN    W   L   T     PCT
    ------------------------------------------------------------
    G    33   Dominic Roussel       14  544    6   5   2   0.915
    G    30   Tommy Soderstrom      9   265    3   3   0   0.911
    G    35   Stephane Beauregard   0   0      0   0   0   0.000


with this code (it was a nice recapitulation of formats for me):



use Games::NHL93;

my $nhl = Games::NHL93->new( -from => "/mnt/win/windows/Desktop/nhl93/all.lp" );

my $team = $nhl->teams->get_team(shift);

my @all = $team->get_players;

my @players = sort { $b->pt <=> $a->pt or $b->gp <=> $a->gp } grep !$_->goalie, @all;

my @goalies = sort { $b->min <=> $a->min } grep $_->goalie, @all;

format STDOUT =
@||||||||||||||||||||||||||||||
$team->name
Pos  Num  Name                  GP  G   A   PT  PIM   +/-
------------------------------------------------------------
@<<< @<<< @<<<<<<<<<<<<<<<<<<<< @<< @<< @<< @<< @<<< @>>> ~~
&pos,&num,&name, &gp,&g, &a, &pt,&pim,&plm
------------------------------------------------------------
Pos  Num  Name                  GP  MIN    W   L   T     PCT
------------------------------------------------------------
@<<< @<<< @<<<<<<<<<<<<<<<<<<<< @<< @<<<<  @<< @<< @<< @>>>> ~~
&gpos,&gnum,&gname, &ggp,&min,&w,&l,&t, &pct
.

write STDOUT;

# Skater fields: each sub peeks at the next player; plm is evaluated
# last on the format line, so it is the one that shifts the player off.
sub pos  { $players[0] && $players[0]->pos }
sub num  { $players[0] && $players[0]->num }
sub name { $players[0] && $players[0]->name }
sub gp   { $players[0] && $players[0]->gp }
sub g    { $players[0] && $players[0]->g }
sub a    { $players[0] && $players[0]->a }
sub pt   { $players[0] && $players[0]->pt }
sub pim  { $players[0] && $players[0]->pim }
sub plm  { $players[0] && (shift @players)->plus_minus }

# Goalie fields: same idea, with pct consuming the goalie.
sub gpos  { $goalies[0] && $goalies[0]->pos }
sub gnum  { $goalies[0] && $goalies[0]->num }
sub gname { $goalies[0] && $goalies[0]->name }
sub ggp   { $goalies[0] && $goalies[0]->gp }
sub min   { $goalies[0] && $goalies[0]->min }
sub w     { $goalies[0] && $goalies[0]->w }
sub l     { $goalies[0] && $goalies[0]->l }
sub t     { $goalies[0] && $goalies[0]->t }
sub pct {
    if ($goalies[0]) {
        my $g = shift @goalies;
        my $pct = sprintf("%.3f", $g->pct || "0.000");
        return $pct;
    }
}


The module's interface, though, is still giving me a headache. The information I want access to is scattered over many different files. Read access in particular will therefore be tricky to get right, considering that there are still many single bytes whose purpose I don't know.

Also, there are still some things that I don't understand. I can increase some of the players' attributes (such as endurance) to very high values. When I try that with shot-power, though, odd things happen. Some values seem to be coupled with others. If I set speed and shot-power very high, the player suddenly moves extremely slowly and each shot virtually backfires in the opposite direction. If I lower speed a bit (or the player just fatigues), it starts working again. Maybe some values are treated as signed, others as unsigned. This would still leave some of the phenomena unexplained.
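
To illustrate the signed-versus-unsigned guess: the same byte unpacks to different numbers depending on the template. This is only a minimal sketch, assuming an attribute is stored as a single byte; the value 0xC8 is made up for illustration and not taken from any NHL93 file.

```perl
use strict;
use warnings;

# Hypothetical attribute byte; 0xC8 is an illustrative value only.
my $raw = chr(0xC8);

my $unsigned = unpack 'C', $raw;   # unsigned char, range 0..255
my $signed   = unpack 'c', $raw;   # signed char, range -128..127

printf "unsigned: %d, signed: %d\n", $unsigned, $signed;
# prints "unsigned: 200, signed: -56"
```

If the game read such a byte as signed, anything above 127 would wrap to a negative number, which might fit a player who suddenly crawls or shoots backwards. It would not explain the coupling between attributes, though.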


Windows

pudge on 2004-05-11T23:32:23

Never before have I wished I used Windows as much as I do right now. :-)

Re:Windows

ethan on 2004-05-12T04:51:44

Hmmh, why? I don't see how this comment relates to the journal entry it is commenting on. :-)

Re:Windows

pudge on 2004-05-12T05:03:23

Because it's a Windows game ...

Re:Windows

ethan on 2004-05-12T05:28:50

Ah, ok!

Provided you have a fast computer (my Athlon 9k is too slow, but it works fine on recent processors), NHL93 will run flawlessly in dosbox, which is much better than dosemu.

You'll also need that for Windows XP (and possibly 2k). I think the only Windowses where such old DOS games can run natively are in the Win9X/ME family.

Re:Windows

ethan on 2004-05-12T07:41:24

my Athlon 9k is too slow

Heh, if I had one, it might just be fast enough. :-)
Of course, I meant to write Athlon 900.

Re:Windows

pudge on 2004-05-12T14:51:56

I only have one x86, and it is a 600 MHz Pentium or something. Oh, but I see it runs on Mac OS X ... hmmmmm ... :-)

AUTOLOAD

pudge on 2004-05-11T23:41:19

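BTW, if it were me, I'd do:
sub AUTOLOAD {
    # Strip the package prefix from the name of the sub that was called.
    (my $name = $AUTOLOAD) =~ s/^.*::(\w+)$/$1/;
    # $1 is the optional goalie prefix, $2 the accessor name.
    return unless $name =~ /^(g?)(pos|num|name)$/i;  # etc.

    my $player = $1 ? $goalies[0] : $players[0];
    $name = $2;
    $player && $player->$name;
}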
Or something like that. You'd need to special case a few things, still.

Re:AUTOLOAD

ethan on 2004-05-12T04:46:19

You are right. These repetitive functions actually cry out for AUTOLOAD. It didn't occur to me at the time I was writing it, maybe because those are functions and not methods.

Games::NHL93 makes extensive use of AUTOLOAD, though. Every attribute and statistical figure is accessed that way, which saved me from writing many boring accessors.
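
For anyone who hasn't seen the trick, AUTOLOAD-based accessors look roughly like this. It is a sketch of the general idea only: My::Player is a hypothetical hash-based class, not the actual internals of Games::NHL93.

```perl
package My::Player;   # hypothetical class, not part of Games::NHL93
use strict;
use warnings;

our $AUTOLOAD;

sub new {
    my ($class, %stats) = @_;
    return bless { %stats }, $class;
}

# Called for any method that is not defined explicitly.
sub AUTOLOAD {
    my $self = shift;
    (my $name = $AUTOLOAD) =~ s/^.*:://;   # strip the package prefix
    return if $name eq 'DESTROY';          # never treat DESTROY as an accessor
    die "No such attribute: $name\n"
        unless ref $self && exists $self->{$name};
    return $self->{$name};
}

package main;

my $p = My::Player->new( name => 'Eric Lindros', g => 12, a => 14 );
printf "%s: %d goals, %d assists\n", $p->name, $p->g, $p->a;
# prints "Eric Lindros: 12 goals, 14 assists"
```

The price is that a typo in a method name only shows up at run time, which is why the sketch dies on unknown attributes instead of silently returning nothing.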