fluff: A Perlish sort of lint

petdance on 2005-02-18T19:58:31

Long ago, when I started on this code base that my department now watches over, I wrote a utility called fluff, akin to lint, which would look for naughty constructs in code. I think it originally would look for things like single-character variable names that weren't $i or $n.

It's grown and evolved over time. Now we add things like using modules that we've out grown (CGI.pm, Array::Iterator), or using a no_plan in our tests. Plus, last night I made it so that you can select what you want to fluff on from the command line. fluff [options] [files] --array-iterator using Array::Iterator --commented Commented-out code --croak Croak, should probably be an assertion --data Variables called "data" --dumper Data dumper diagnostics --else-one-line else and result on one line --if-one-line if and result on one line --mech-ok Using ok() for mech content instead of $mech->content_like --no_plan no_plan --oci OCI calls --pathing Up-n-over pathing --ref-proto ref proto || proto --return-undef Returning undef --review REVIEWed things to be addressed later --sqldo sqldo using vars, not binds --todo TODOed things to be addressed later --ttml Calling TT process on a .tt file (not .ttml) --using using CGI.pm --xxx XXXed things to be addressed later



Here's the code if anyone wants to steal, or have any suggestions. #!/usr/bin/perl -w

use strict; use File::Find; use Data::Hash::Totals; use List::Util qw( max ); use Getopt::Long;

=head1 TODO

All .ttml files should be twcollapsed

=cut

my %getopt_options = ( help => sub{ help(); exit; } ); my %checks;

for my $opt ( possible_errors() ) { my $tag = $opt->[0]; $getopt_options{$tag} = eval qq/ sub { \$checks{"$tag"} = 1 } /; }

GetOptions( %getopt_options ) or exit 1;

# Set all options on if none are explicitly set if ( !%checks ) { %checks = map {($_->[0],1)} possible_errors(); }

my %errors; if ( @ARGV ) { process_file($_) for @ARGV; } else { find( \&handler, "." ); }

my %totals; my $nerrors = 0; for my $pe ( possible_errors() ) { $totals{$pe->[1]} = 0 if $checks{ $pe->[0] }; }

for my $errstr ( sort keys %errors ) { print "$errstr\n"; my @errors = @{$errors{$errstr}}; print " $_\n" for @errors; print "\n"; $totals{$errstr} = @errors; $nerrors += @errors; }

print as_table( \%totals ); printf( "%4d Total\n", $nerrors );

sub handler { if ( -d ) { $File::Find::prune = 1 if /\b(\.svn|CVS)\b/; return; } return unless /\.(css|tt|ttml|t|pm|php|phpt|html)$/;

process_file( $_, $File::Find::name ); }

sub process_file { my $filename = shift; my $displayname = shift || $filename;

open( my $fh, $filename ) or die "Can't open $filename: $!"; my @checks = grep { $checks{$_->[0]} } possible_errors(); while ( my $line = <$fh> ) { for ( @checks ) { my ($tag,$desc,$finder) = @$_; if ( $line =~ $finder ) { chomp $line; push( @{$errors{$desc}}, "$displayname($.): $line" ); } } # for } # while close $fh; } # process_file()

sub help { print "fluff [options] [files]\n";

my @pe = possible_errors(); @pe = sort { $a->[0] cmp $b->[0] } @pe; my $max = max map { length $_->[0] } @pe;

for my $opt ( @pe ) { my ($tag,$desc) = @$opt; printf( " --%-*s %s\n", $max, $tag, $desc ); } }

sub possible_errors { return ( [ xxx => "XXXed things to be addressed later" => qr/\bXXX\b/ ], [ todo => "TODOed things to be addressed later" => qr/TODO/ ], [ review => "REVIEWed things to be addressed later" => qr/\bREVIEW\b/ ], [ no_plan => "no_plan" => qr/\bno_plan\b/ ], [ commented => "Commented-out code" => qr/^\s*#.*\$.*=/ ], [ 'return-undef' => "Returning undef" => qr/return\s+undef/ ], [ data => 'Variables called "data"' => qr/[\$\%\@]data\d*\b/ ], [ pathing => 'Up-n-over pathing' => qr/\.\.\// ], [ 'ref-proto' => 'ref proto || proto' => qr/ref\b.+\$\w+.*\|\|.*\$\w+/ ], [ 'mech-ok' => 'Using ok() for mech content instead of $mech->content_like' => qr/^\s*ok.*>content/ ], [ croak => 'Croak, should probably be an assertion' => qr/croak\(/ ], [ dumper => 'Data dumper diagnostics' => qr/warn\s+Dumper/ ], [ ttml => 'Calling TT process on a .tt file (not .ttml)' => qr/process\(.*\.tt[^m]/ ], [ 'array-iterator' => 'using Array::Iterator' => qr/Array::Iterator/ ], [ oci => 'OCI calls' => qr/OCI\w+\(/ ], [ 'if-one-line' => 'if and result on one line' => qr/\b(if|elsif|elseif)\b\s*\(.+\)\s*{.+}/ ], [ 'else-one-line' => 'else and result on one line' => qr/\belse\b\s*{.+}/ ], [ using => 'using CGI.pm' => qr/use CGI[^:]/ ], [ sqldo => 'sqldo using vars, not binds' => qr/sqldo.+=\s*'?\$\w+/ ], ); }


that could prove useful

n1vux on 2005-02-18T21:41:05

I'll have to hack it to have it check what I want checked, but I'm sure I can find a use for this -- Thanks!

admirable discipline

lachoy on 2005-02-18T21:46:08

wow -- there's NO WAY I could name anything 'fluff' with the gutterminds (myself included) at my last job...

Re:admirable discipline

petdance on 2005-02-18T22:15:16

Yeah, I thought about that before posting here. :-)

Outgrowing CGI.pm

bart on 2005-02-19T11:45:31

You've "outgrown CGI.pm"? What does that mean? What's bad about it, what do you recommend to use instead?

Not that I disagree with you, it's just that this apparently is a very controversial stand, in the Perl world. It's considered close to herecy to dis it, apparently.

Re:Outgrowing CGI.pm

petdance on 2005-02-19T15:52:02

We're using Template Toolkit now, not building tags on the fly like we used to.

Re:Outgrowing CGI.pm

perrin on 2005-02-22T18:33:16

Controversial? People diss CGI.pm all the time. It's so slow and bloated that even the author thinks it should be replaced. There are plenty of good alternatives on CPAN, like CGI::Minimal, and mod_perl people typically use Apache::Request instead.

Re:Outgrowing CGI.pm

petdance on 2005-02-22T18:56:03

We're using it mostly for tag building, anyway. For us "using CGI.pm" really means "not using templating."

asdfsdf

statico on 2005-02-21T16:41:48

I just tried it out, neat!

It caught a "return undef" in a comment.

I'd like it to strip whitespace from lines it matches, so we don't see File.pm(123): [lots of indentation] # foo.

VERY handy, since we were doing a lot of these checks by hand. Why didn't it occur to us to write a script? :)