Code Which Gives Me Pause

Ovid on 2007-06-18T11:54:50

In checking our code coverage stats, I noticed on condition for a module needed to be tested to bring coverage to 100%. Then I looked at the line of code:

my $country = $card->country;
$country = $country eq 'gbr' ? '' : uc "[$country]" if $country;

A ternary operator and statement modifier on the same line? You're packing three conditions onto one line of code! Some folks would think that's just fine, but it seems to obscure the intent. How would you rewrite that?


define it

barbie on 2007-06-18T12:07:13

I'm inclined to go for something more like:

my $country = $card->country;
$country = (!$country || $country eq 'gbr') ? '' : uc "[$country]";
Unless the difference between undef and an empty string matters.

That depends...

phaylon on 2007-06-18T12:59:54

...on where the vars are used. In that example, it looks like the $country scalar is only used for the check. But I might do it like this:

if (my $country = $code->country) {
  $country = ( $country eq 'gbr' ? '' : uc "[$country]" );
}
.phaylon

Something like this

Juerd on 2007-06-18T13:15:11

my $country = uc $card->country || '';
$country = ($country =~ /^$|GBR/ ? '' : "[$country]");
Possibly ^GBR$.

Re:Something like this

Aristotle on 2007-06-18T14:05:49

Yuck. Using a pattern match to test string equality is smelly.

Re:Something like this

Juerd on 2007-06-18T18:05:44

I can think of two reasons.

One is ugly syntax, which I find a silly argument. All Perl code is ugly, but elegance is more important than beauty. And I think this is very elegant.

The other is that you can only have 2_501_710 such matches per second (on my PM 1,5 GHz laptop, the average of 50% matches and 50% non-matches), which can be a valid argument in some cases. But if performance is the issue here, the method call should go first: my laptop can do just 1_415_988 method calls per second, and that's on an empty subroutine.

I wonder if you have a third reason for not using a regex here.

Re:Something like this

Aristotle on 2007-06-18T19:53:10

I am honestly somewhat alarmed that you could only come up with straw men. How about “a regex match always has many more subtleties and edge/corner cases than straight string comparison”?

As soon as I saw your code I wondered: “Did he mean ^GBR$ or was the omission of anchors deliberate?” And next: “Hmm, he uppercased it but there’s no /i – was that on purpose?”

I prefer to minimise the number of speed bumps that a casual reader of the code has to pass.

Re:Something like this

Juerd on 2007-06-18T22:46:01

As for the omission of anchors: probably a bad idea. I considered this and added a remark about it to the post. OTOH, I frequently have code like:

$lang =~ /nl|en|de|eo/
if I know $lang will always be 2 characters. Here, I didn't know, and should have used anchors.

The case sensitivity has nothing to do with regexes. With $foo eq 'GBR' you can (should) ask yourself the same question: shouldn't uc($foo) eq 'GBR' be used instead? However this is somewhat irrelevant here, as the line just before the match, an explicit "uc" was used. Whoever reads lines of code out of context, will be puzzled unnecessarily.

I also try to make my code easy to read, but apparently using different metrics for that. I found the double condition hard to grok, and abstracted that into a single regex.

If I were writing the same code, knowing that newbies would maintain it (I refuse to have this as a default style!), I would have written:

my $country = $card->country || '';

if ($country eq 'gbr') {
        $country = '';
} elsif ($country ne '') {
        $country = uc "[$country]";
}


But I try to avoid such repetitive code in anything written for fun or production.

Re:Something like this

Aristotle on 2007-06-19T00:27:42

I’m not advocating baby-talk code like that.

Note that neither of the variants you suggest preserves undefs/zeroes. For that case, I gave a much simpler version that folds both conditions into a single ternary.

I don’t see how putting two conditions inside a regex as alternation is abstraction. You don’t have two Perl expressions, but instead a regex with two match possibilities. The complexity hasn’t gone anywhere, it’s still there right in front of the reader’s eyes, it has just taken on a different shape.

Re:Something like this

Juerd on 2007-06-19T09:14:29

I am indeed assuming that a method called "country" will not return 0, and that undef is to be equal to "". The addition of square brackets, and the complete omission of the country indication when it's one specific value, tells me that this will be used in another string, meant for end users.

The complexity is still there entirely, but written in a more compact way: there's only one character in between. This makes it easier --for me-- to notice that in practice, both conditions lead to the same thing.

I don't think such a minor stylistic difference is worth being alarmed over...

Re:Something like this

Aristotle on 2007-06-19T22:48:15

OK, being alarmed is hyperbole... to an extent. This particular example does not warrant it, but on general principle I *would* be worried if the only reasons you could come up with for preferring string comparison over regexes are performance and ugly syntax.

Compare:

my $country = uc $card->country || '';
$country = ($country =~ /^$|GBR/ ? '' : "[$country]");

With:

my $country = $card->country;
$country = ( $country and $country ne 'gbr' ) ? uc "[$country]" : '';

Regardless of how you turn it, I doubt many people will agree with you that the former is easier to read than the latter – there are more interdependencies to pick up on in the former. It has two boolean expressions rather than one, and the uc on one line ties toegether with the regex match on the other. (I honestly only just noticed that, even though you previously explicitly pointed it out.) In the latter, each of the parts does less.

Btw, previously you wrote:

OTOH, I frequently have code like $lang =~ /nl|en|de|eo/ if I know $lang will always be 2 characters.

Yeah, this is preferrable over the non-regex version, I think (only barely, but it is), but once we have smartmatch, I would write it that way. (In Perl 6 I’d use a junction, since that reads even more nicely; which brings us full circle since it would look very much like your code.)

probably not like this

hdp on 2007-06-18T13:31:22

(but I can't resist anyway)
(my $country = $card->country || '') =~ s/^(?!gbr)(.+)$/\U$1/;

Much like barbie…

Aristotle on 2007-06-18T14:14:49

… only I’d write the conditional the other way around:

my $country = $card->country;
$country = ( $country and $country ne 'gbr' ) ? uc "[$country]" : '';

Re:Much like barbie…

Aristotle on 2007-06-18T14:24:54

Oops, that’s not actually the same. Well, it might be OK; if zeroes and undefs are not valid values that you need to care about, then that is how I’d write it.

If you do need to distinguish, though, then I’d put it this way:

my $country = $card->country;
$country
    = not $country          ? $country
    :     $country eq 'gbr' ? ''
    :                         uc "[$country]";

Re:Much like barbie…

Juerd on 2007-06-18T22:48:58

As for speedbumps... nested ternaries really slow /me/ down! (Except for simple 1-on-1 mappings.)

Re:Much like barbie…

Aristotle on 2007-06-19T00:20:17

Did you miss the part where I said this complex solution is only necessary if undef and zero need to be preserved? Sure this one is complex – because a three-way condition is always complex.

Re:Much like barbie…

Juerd on 2007-06-19T09:20:47

It is never necessary. Every such nested ternary can be written more verbosely, although you may have to use temporary variables.

my $country = $card->country;
if ($country) {
    $country = uc "[$country"];
    $country = "" if $country eq "[GBR]";
}
(I swapped the uc and gbr check around, because it makes the code prettier IMO.)

Though what's really needed in this code, regardless of the Perl syntax used, is a comment stating its intention. Perhaps:

# Don't mention the country if it's unknown or GBR
Because that is always more clear.

Re:Much like barbie…

Aristotle on 2007-06-19T23:01:28

Every such nested ternary can be written more verbosely, although you may have to use temporary variables.

Didn’t you just exclaim in another comment that you refuse to have this as a default style? Personally I find nested ternaries, if they’re laid out like a table, much easier to read than imperative code with a pile of temporary variables doing the same thing. Much easier to maintain, too.

Though what’s really needed in this code, regardless of the Perl syntax used, is a comment stating its intention.

In my book, comments are always a sign of big trouble for one reason or another. I only leave them behind to explain why I made a particular coding decision when it would seem weird to someone who doesn’t know the whole picture – eg. an odd bug workaround that a maintainer might be tempted to “correct”. I try very hard to avoid having to explain the code itself. The canonical authority on what the code does is the code itself – this is a cardinal rule.

If you need to write a comment for something as simple as the code we’re discussing, then something is very wrong. (Probably it’s that we’ve nitpicked the example to death.)

Re:Much like barbie…

Juerd on 2007-06-20T12:55:59

Yes, I did. But I would pick such a verbose style over short ?: tables.

As for comments: they provide redundancy. When the comment and the code are no longer in sync, that's a good sign something's wrong, and that helps to write good code.

where "use warnings" burns

merlyn on 2007-06-18T15:08:07

My guess is that sometimes, $country is undef. And because it might be, the author writes horribly perverted code just to avoid getting an irrelevant warning message. This is the downside of warnings.

And given that, I'd have written this code as: $country = $country eq 'gbr' ? '' : uc $country; See how much clearer it is when you stop worrying about stupid warnings? Don't be pedantic about warnings. Stop using them when they make your code look ugly. (/end rant).

Re:where "use warnings" burns

slanning on 2007-06-18T17:12:05

My interpretation....was that $country wasn't dealing with definedness, but was instead determining whether or not to use brackets. If there's nothing there, "[]" might look stupid, so you don't want to put the brackets; and if it's 'gbr', you don't want anything there for whatever reason.

Re:where “use warnings” burns

Aristotle on 2007-06-18T18:26:30

If it’s really just undef, then an or '' in the appropriate spot would suffice – no need for convolution.

That said, almost all my code starts like so:

use warnings;
no warnings qw( once qw );

Re:where “use warnings” burns

sigzero on 2007-06-21T11:58:30

I looked in the perllexwarn pod and it just shows "once" in the hierarchy. What exactly does that do?

Re:where “use warnings” burns

Ovid on 2007-06-21T12:10:37

Sometimes you refer to a variable in another file:

$CGI::DISABLE_UPLOADS = 1;

That snippet will prevent a DOS on your server from folks repeatedly uploading files. However, it's probably only in your code once and warnings will issue a warning about it which you can disable with no warnings 'once'.

It's designed, however, for cases where you declare something and never use it again. That's probably cruft that you don't actually want in your code (or worse, you declare something but refer to it by the wrong name later).

Re:where “use warnings” burns

Aristotle on 2007-06-21T12:51:42

It suppresses warnings like this:

Name "main::x" used only once: possible typo at -e line 1.

This is a poor man’s use strict 'vars': the idea is that if you used a variable just once, it’s probably because you meant to use another variable but mistyped its name. However, you will get this warning only about package variables, not about my lexicals.

These days, I generally use package variables only when I need to change a global setting in some module I’m using, such as the $CGI::DISABLE_UPLOADS variable Ovid mentioned. In that case, it’s a given that you set the variable once and never refer to it again, triggering the warning. So that warning is an annoyance that doesn’t serve any useful purpose and therefore I turn it off. I could work around it with stupid stuff like $CGI::DISABLE_UPLOADS = $CGI::DISABLE_UPLOADS = 1, which we used to do before Perl 5.6 because warnings could only be turned off completely, not selectively. But that just amounts to making silly faces at the compiler and I see no point in that.

Oh yeah – I know!

Aristotle on 2007-06-18T18:24:44

my $country = $card->country;
$country ||= $country eq 'gbr' ? '' : uc "[$country]";

SCNR… :-)

Re:Oh yeah – I know!

lbr on 2007-06-18T22:52:19

my $country = $card->country;
$country    = ''              if $country eq 'gbr';
$country    = uc "[$country]" if $country;

Keep it simple.

Re:Oh yeah – I know!

Aristotle on 2007-06-19T00:11:25

Except that this will cause warnings when $country starts out undefined.

Refactor

djberg96 on 2007-06-21T12:24:00

I'd refactor the Card class so that the default value for the getter was "GBR" and the setter automatically uc'd its argument. With that in place:

my $country = $card->country eq 'GBR' ? '' : "[$card->country]"

Re:Refactor

Aristotle on 2007-06-21T14:22:14

’cept you can’t interpolate method calls.