comment on MJD's Clubbing someone to death with a loaded Uzi

bart on 2008-01-10T22:51:42

Mark Jason Dominus is a Perl hacker and book author with a blog. In one of his latest entries, Clubbing someone to death with a loaded Uzi, he rather harshly critiques other people's (beginners) code. But whenever you do that, you should make sure your replacement code isn't dodgy itself.

There is no way to let people comment on the blog, so I'm posting my remarks here.

He writes:

It could have been written like this:
printf FILE "$LOCATION{$location}\,";
printf FILE "%4s", "$min3\,";
printf FILE "%4s", "$max3\,";
printf FILE "%1s", "$wx3\n";
Eww. There's a few red flags in there:
  • Don't use printf where your data is used as a template. Granted, in this example, the data comes from a hash that is initialized with literal data stored in the script, but projects tend to evolve, and data just is not a template. If ever the data would contain a "%" sign, this code will blow up.
  • Why leave in first joining variables with other characters (such as commas), and then format the result with printf? Put the comma in the template.
  • There's no need for this to be broken up into 4 statements.
In summary: this code could have become:
printf FILE "%s,%3s,%3s,%1s\n", $LOCATION{$location}, $min3, $max, $wx3;
which is quite a bit shorter, and cleaner too, IMO.

But there are still more things wrong with the original code that he didn't discuss. For example:
foreach $location_name (%LOCATION ) {
    $location_code = $LOCATION{$location_name};
    ...
H*ll, if you want to loop over the keys of a hash, at least don't loop over both the keys and the values! Granted, in MJD's replacement code, the loop is gone, so this problem has disappeared too, but this is an major mistake that shouldn't just be skipped over.
foreach $location_name (keys %LOCATION ) {
    ...
Otherwise, if ever one of your hash values is also used as a key in this hash, you'll get noise output.

I am guessing that one reason why this person bothers to loop over a hash trying to find a particular hash item, may be to avoid outputting anything in case the item isn't in the hash. MJD just drops this, and outputs a line anyway. So I'm adding that conditional back in:
printf FILE "%s,%3s,%3s,%1s\n", $LOCATION{$location}, $min3, $max, $wx3
  if exists $LOCATION{$location};


There. Comments? Anything that I overlooked?


Simplicity

petdance on 2008-01-11T03:28:30

I think that Mark's intent was not to show how it would have been refactored completely, but rather to show the difference as it related to his original point. I'm certain that he left the code the way it was as a conscious choice.

MJD is certainly adept at spotting red flags.

The point.

Aristotle on 2008-01-11T04:51:22

Anything that I overlooked?

Yes: the point of his post. :-)

It might have been useful if he had added a note to the effect that “this is still not very good, but it’s the first step to cleaning it up” – but a complete cleanup wasn’t what he set out to write about.

Re:The point.

bart on 2008-01-11T10:57:00

Well, I beg to differ. In my draft for this post I had originally written (and next dropped it as overly harsh):

When you touch up code, any junk that you leave behind in statements that you touch, becomes your junk.
Leaving in the statement

printf FILE "$LOCATION{$location}\,";
now becomes his code.

understanding beginners' code

mr_bean on 2008-01-11T10:36:54

The other way of looking at the beginner's confused code would be to see it as an interaction between building on what you know and pulling rabbits out of a hat, and the hope that the two will eventually meet in the middle.

Too harsh

jtrammell on 2008-01-11T15:46:37

It's clear that MJD is addressing issue of the loop logic ("iterating over the keys of a hash"), and not what is done in the loop.

I thought...

phillup on 2008-01-11T16:56:54

But there are still more things wrong with the original code that he didn't discuss. For example:

    foreach $location_name (%LOCATION ) {
        $location_code = $LOCATION{$location_name};
        ...
H*ll, if you want to loop over the keys of a hash, at least don't loop over both the keys and the values! Granted, in MJD's replacement code, the loop is gone, so this problem has disappeared too, but this is an major mistake that shouldn't just be skipped over.
Actually, I thought the entire point of that part of the article was to point out that the user was using a loop to find the element that they could have simply addressed directly... and (I suppose) that is why he didn't address other "issues" in the code. They weren't the point.

At least that was my interpretation...

Re:I thought...

jjore on 2008-01-13T01:40:22

I thought it was interesting that MJD refactored the loop incorrectly. The original loop had the logic that if the key didn't exist in the hash, nothing would happen. MJD's refactoring caused it to be unconditional. That might be intended but it wasn't mentioned.

the following bit would have been more accurate.

if ( exists $hash{foo} ) {
    ++ $hash{foo}
}