A few weeks ago, we implemented a reporting system. Since we're huge believers in agile methodologies, we threw things together fairly quickly, using TDD. This exposed two problems. First, the "class" we implemented was a sloppy mess of procedural code with a OO wrapper. Second, the tests were a similarly sloppy mess of procedural code.
The last few days, I've had to extend this and I found it very hard to do. I've rewritten the tests to be data driven and I now iterate over a list of known states. They're much easier to work with and understand, but they also make it clear that some of our code has needless duplication.
The big win, though, was rewriting the "class" to actually be a class. Instead of huge methods with a bunch of if/else statements, I relied on proper method dispatching. The resulting code is shorter (278 lines of code are now 101 lines of code), easier to read, and exposed some confusion in the underlying thinking. The big issue, though, was that I was left with a single "if" statement:
if ( 'yesterday' eq $time_period ) { $date->subtract( days => 1 ); }
I went from eight "if" statements to only one. The previous "if" statements were all failures of our code to take advantage of the dispatching capabilities of OO. The final "if" statement, however, was bugging the heck out of me. I realize now that it means we've hobbled ourselves. What if we want three days ago? Nine days ago? It turns out that we really can't do that. Fixing this is easy but it does require a change in the API. Once again, conditional checks in OO code prove themselves to be code smells.
Legitimate conditional checks should (I'm not a purist) generally be on pure booleans, not calculated ones. The condition above is faulty, but the following condition is reasonable:
if ( $debug ) { $self->log($message); }
That's a pure boolean and is the sort of thing I don't really have an issue with. Calculated booleans in OO code always give me pause.
Thanks for the heuristic! I never realized that before! I think this will help improve my coding.
(If there's a larger source where you picked that up, I'd be benefitted by a pointer.)
Re:Code smell heuristic
Ovid on 2008-04-29T16:09:59
(If there's a larger source where you picked that up, I'd be benefitted by a pointer.)Thought I've later found that others point this out, I doubt how I learned this will help you
:) Re:Code smell heuristic
Ovid on 2008-04-29T16:42:35
I figured a concrete example would help. Imagine this awful method from some Catalyst code I was working on:
sub _set_group_by : Private {
my ( $self, $c ) = @_;
my $group_by = 'day';
my $period = $c->stash->{report_metadata}{period};
my $report_range
= $c->stash->{report_metadata}{start}->day_month_year
. ' – '
. $c->stash->{report_metadata}{end}->day_month_year;
if ($period eq 'today' || $period eq 'yesterday') {
$group_by = 'hour';
$report_range = $c->stash->{report_metadata}{start}->day_month_year;
}
$c->stash->{group_by} = $group_by;
$c->stash->{report_range} = $report_range;
return;
}As you can see, we have a nasty conditional which basically throws away the work previously done. We could turn that into an if/else block and at least ensure that the assignments are only done once, but as it turns out, the core problem was this:
sub report : Path {
my ( $self, $c, @args ) = @_;
$c->stash->{report_metadata}{period} = $args[0];
$c->forward('_set_start_and_end_dates');
$c->forward('_set_group_by');
}First, we have the problem that the "period" can now be just about anything and we have to manually check it. Second, because we do things manually, we have that nasty if/else block. Instead, I wrote something like this:
sub report_day : LocalRegex('^(yesterday|today)/$') {
my ( $self, $c, @args ) = @_;
$c->stash->{report_metadata}{period} = $c->req->captures->[0];
$c->forward('_set_start_and_end_dates');
$c->stash->{group_by} = 'hour';
$c->stash->{report_range}
= $c->stash->{report_metadata}{start}->day_month_year;
}
sub report_month : LocalRegex('^(last7days|month)/$') {
my ( $self, $c, @args ) = @_;
$c->stash->{report_metadata}{period} = $c->req->captures->[0];
$c->forward('_set_start_and_end_dates');
$c->stash->{group_by} = 'day';
$c->stash->{report_range}
= $c->stash->{report_metadata}{start}->day_month_year
. ' – '
. $c->stash->{report_metadata}{end}->day_month_year;
}That's still pretty ugly, but the '_set_group_by' method goes away and the conditionals are eliminated. It also clarifies the code and groups the needed information together. After a lot of this was done, the simplification was revealing all sorts of underlying problems that our complex code was masking. The big win, though, is that we no longer need to check whether or not our 'period' is valid because that's handled by the method dispatching! (This is obscured in this instance because it's Catalyst, but Catalyst really can save you a lot of pain here)
Re:Code smell heuristic
theguvnor on 2008-04-29T20:34:04
Thanks for the example, but I'm still confused... take pity on this dim one, and explain why the conditional check hasn't just moved elsewhere in your code where you decide which method to call? Or is that the entire point?Re:Code smell heuristic
Ovid on 2008-04-29T21:42:13
In this case, it's done via a customer selecting a URL, so it's up to the customer to decide what they want and we don't have to.
A better example would be with Java:
public void setName(String name) {
this.name = name;
}
public void setName(String first, String last) {
this.name = first + " " + last;
}In regular Perl (no modules), it sometimes gets mistakenly rewritten as this:
sub name {
my ( $self, $first, $last ) = @_;
$self->{name} = $last ? "$first $last" : $first;
}While that's a silly example, it means that you cannot set the last name to a value which evaluates as false. Variants of this bug occur all the time because every conditional you embed has a great chance of adding bugs. With Java, conditional argument handling bugs are much less common (but the methods bind at compile time instead of runtime, so it's a trade-off, but that's another rant
:). Re:Code smell heuristic
jplindstrom on 2008-04-29T20:59:32
Remove duplication. Name things. These things go together.
Look at the method names. Do they really focus on what the methods do? Where is the axis of duplication? (hint: group by)
Re:Code smell heuristic
Ovid on 2008-04-29T21:35:02
This was a tiny example and merely a first-pass refactor. If you look at the actual code I checked in, you'll see it's much cleaner.
Re:Code smell heuristic
Aristotle on 2008-04-29T22:03:14
sub report : Chained('somewhere?') PathPart('') CaptureArgs(0) {
my ( $self, $c, @args ) = @_;
$c->action->chain->[-1]->name =~/\Areport_for_(.*)\z/
or die 'Chain ends at action with malformed name';
$c->stash->{report_metadata}{period} = $1;
$c->forward('_set_start_and_end_dates'); # inline this?
}
sub report_day : Chained('report') PathPart('') CaptureArgs(0) {
my ( $self, $c, @args ) = @_;
$c->stash->{group_by} = 'hour';
$c->stash->{report_range}
= $c->stash->{report_metadata}{start}->day_month_year;
}
sub report_month : Chained('report') PathPart('') CaptureArgs(0) {
my ( $self, $c, @args ) = @_;
$c->stash->{group_by} = 'day';
$c->stash->{report_range}
= $c->stash->{report_metadata}{start}->day_month_year
. ' – '
. $c->stash->{report_metadata}{end}->day_month_year;
}
BEGIN {
my %report = (
today => 'day',
yesterday => 'day',
last7days => 'month',
month => 'month',
);
while ( my ( $period, $type ) = each %report ) {
eval(
"sub report_for_$period:"
. " Chained('report_$type')"
. " PathPart('$period')"
. ' Args(0) {}'
);
}
}Re:Code smell heuristic
jplindstrom on 2008-04-30T18:42:56
Ctrl-C + Ctrl-V
Re:Too much work
Ovid on 2008-04-30T07:03:59
The conditional is a point I make in my OO guidelines writeup on PM. It's bad practice. I make the same point here, too. So yes, I do agree with you, but I plead mea culpa of posting a bad example to explain good code
;) Re:Too much work
hanekomu on 2008-04-30T17:07:33
You could get rid of the "if ($debug) { $self->log(...) }" by using Class::Null. That is, the log object (assuming it's an object) should be Class::Null by default and a real logging object if $debug is set. Then just do "$self->log(...)" without the conditional, and if it's a Class::Null object, the call is more or less ignored. The documentation of Class::Null actually uses this example.