More refactoring goodness

tsee on 2009-06-14T16:50:40

Writing code that modifies code is a difficult task. Writing code that modifies Perl code is a horrible task. Thankfully, writing Perl code that modifies Perl code is not quite as horrible as it could be, thanks to Adam Kennedy's PPI.
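To give a flavor of what PPI makes possible, here is a minimal sketch of a variable rename. It is not Padre's implementation: the helper name rename_variable is made up for this illustration, and unlike Padre's feature it ignores lexical scope entirely and simply rewrites every matching symbol token in the snippet.

```perl
use strict;
use warnings;
use PPI;

# Hypothetical helper (not part of PPI or Padre): rename every occurrence
# of a variable in a string of Perl code. Lexical scope is NOT considered.
sub rename_variable {
    my ($source, $old_name, $new_name) = @_;

    my $document = PPI::Document->new(\$source)
        or die "PPI could not parse the source";

    # find() returns an array reference of matches, or a false value if none.
    my $symbols = $document->find('PPI::Token::Symbol') || [];
    for my $symbol (@$symbols) {
        $symbol->set_content($new_name) if $symbol->content eq $old_name;
    }

    # serialize() turns the modified document back into source text.
    return $document->serialize;
}

my $code    = 'my ($op1, $op2) = @_; return $op1 + $op2;';
my $renamed = rename_variable($code, '$op1', '$y');
print "$renamed\n";
```

A real refactoring tool has to do more work than this, such as locating the declaration and restricting the rename to the enclosing scope, which is exactly the tedious part Padre takes care of.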

One stated goal of the Padre project is to provide refactoring tools for Perl code, as far as that is reasonably possible. So far, there are shortcuts for replacing a variable in its lexical scope, finding a variable's declaration (be it a lexical, file-scoped (our), or package variable declared with 'use vars'), finding unmatched braces, and aligning code blocks on operators. These features are all useful, but they're only a subset of what more mature projects like Eclipse provide. A recent post on PerlMonks discusses some examples of refactoring tools (or strategies) and their applicability to different languages. One of these is the Introduce Explaining Variable pattern. It's now implemented in Padre trunk. It's really quite simple. Let me explain with an example:

The following code implements the derivative of the atan2 function. The code is from the Math::Symbolic::Derivative module. (I wrote it, so I'm complaining about my own cruft.) It basically implements the equation shown in the comment near the top of the subroutine.

    sub _derive_atan2 {
        my ( $tree, $var, $cloned, $d_sub ) = @_;

        # d/df atan(y/x) = x^2/(x^2+y^2) * (d/df y/x)
        my ($op1, $op2) = @{$tree->{operands}};
        my $inner = $d_sub->( $op1->new()/$op2->new(), $var, 0 );

        # templates
        my $two = Math::Symbolic::Constant->new(2);
        my $op = Math::Symbolic::Operator->new('+', $two, $two);

        my $result = $op->new('*',
            $op->new('/',
                $op->new('^', $op2->new(), $two->new()),
                $op->new(
                    '+',
                    $op->new('^', $op2->new(), $two->new()),
                    $op->new('^', $op1->new(), $two->new())
                )
            ),
            $inner
        );
        return $result;
    }
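For reference, the identity in the comment is just the chain rule applied to atan(u) with u = y/x. Written out (f is the variable being derived with respect to, as in the comment):

```latex
\frac{d}{df}\arctan\!\left(\frac{y}{x}\right)
  = \frac{1}{1 + \left(\frac{y}{x}\right)^{2}} \cdot \frac{d}{df}\!\left(\frac{y}{x}\right)
  = \frac{x^{2}}{x^{2} + y^{2}} \cdot \frac{d}{df}\!\left(\frac{y}{x}\right)
```

In the subroutine, the last factor is what ends up in $inner, and the fraction in front of it is the nested '/' expression.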

Now, this code is pretty hard to read. The $op1 and $op2 variables correspond to the function operands y and x, respectively. $d_sub is a closure that can derive recursively. The two templates are simply a shorthand so I didn't have to write someclass->new(...) repeatedly. To make x and y more apparent and to give $d_sub a name that better fits its purpose, I open up the file in Padre, right-click each of those variables, select Lexically Replace Variable from the context menu, and provide the new names. Similarly, I replace $inner. This yields:

    sub _derive_atan2 {
        my ( $tree, $var, $cloned, $derive ) = @_;

        # d/df atan(y/x) = x^2/(x^2+y^2) * (d/df y/x)
        my ($y, $x) = @{$tree->{operands}};
        my $inner_derivative = $derive->( $y->new()/$x->new(), $var, 0 );

        # templates
        my $two = Math::Symbolic::Constant->new(2);
        my $op = Math::Symbolic::Operator->new('+', $two, $two);

        my $result = $op->new('*',
            $op->new('/',
                $op->new('^', $x->new(), $two->new()),
                $op->new(
                    '+',
                    $op->new('^', $x->new(), $two->new()),
                    $op->new('^', $y->new(), $two->new())
                )
            ),
            $inner_derivative
        );
        return $result;
    }

Of course, that leaves intact the giant expression that actually calculates the result. It makes sense to add a few more temporary variables with descriptive names. I select $op->new('^', $x->new(), $two->new()) in the above version of the code, right-click, and select Insert Temporary Variable. Then I type the name of the new variable, $x_square. Padre finds the beginning of the current statement for me and inserts a temporary variable declaration for $x_square at that point. It also replaces the selected text with $x_square. I manually replace the other occurrence of the same expression with the new temporary, then select $op->new('^', $y->new(), $two->new()) and have it replaced with $y_square in the same way. There's more that can be cleaned up, but this handful of clicks and practically no typing has improved the code's readability considerably:

    sub _derive_atan2 {
        my ( $tree, $var, $cloned, $derive ) = @_;

        # d/df atan(y/x) = x^2/(x^2+y^2) * (d/df y/x)
        my ($y, $x) = @{$tree->{operands}};
        my $inner_derivative = $derive->( $y->new()/$x->new(), $var, 0 );

        # templates
        my $two = Math::Symbolic::Constant->new(2);
        my $op = Math::Symbolic::Operator->new('+', $two, $two);

        my $x_square = $op->new('^', $x->new(), $two->new());
        my $y_square = $op->new('^', $y->new(), $two->new());
        my $result = $op->new('*',
            $op->new(
                '/', $x_square,
                $op->new('+', $x_square, $y_square)
            ),
            $inner_derivative
        );
        return $result;
    }

Thus Padre helps me refactor crufty code with ease. Many more of these tiny helpers are planned. Stay tuned!

PS: If this didn't convince you, maybe you should just give it a shot. I had to wrestle use.perl for hours to get it to add the highlighting in the example code. If I could add screenshots of the real thing...