Class::Sniff Now Detects "Duplicate" Methods

Ovid on 2009-02-02T20:58:58

Yeah, I know, you're sick of hearing about Class::Sniff. I'll be over this manic streak soon :)

It now detects and reports "duplicate" methods via this evil hack:

my $walker = B::Concise::compile( '-terse', $coderef ); 
B::Concise::walk_output( \my $buffer);
$walker->();
$buffer =~ s/^.*//;   # strip method name
$buffer =~ s/\(0x[^)]+\)/(0xHEXNUMBER)/g;   # normalize addresses
my $digest = Digest::MD5::md5_hex($buffer);
$self->{duplicates}{$digest} ||= [];

# later we check if there is more than one entry
push @{ $self->{duplicates}{$digest} } => [$class, $method];

It's severely limited, but I'm hoping it will offer a slight benefit for obviously cut-n-pasted code. We report duplicates based on identical op-trees. If the method names are different or the variable names are different, that's OK. Any change to the op-tree, however, will break this. The following two methods are identical, even if they are in different packages:

sub inc {
    my ( $self, $value ) = @_;
    return $value + 1;
}

sub increment {
    my ( $proto, $number ) = @_;
    return $number + 1;
}

However, this will not match the above methods:

sub increment {
    my ( $proto, $number ) = @_;
    return 1 + $number;
}

Class::Sniff 0.05 should be at your local CPAN mirror soon.

(Now to get around to the Foo::Bar, Foo::Bar::Baz issue that Adam Kennedy pointed out).

Stupid question...

btilly on 2009-02-02T22:13:34

If two packages import the same function from a module, will it think that this is duplicated code in 3 packages, or will it notice that they are the same subroutine so one function happens to appear in 3 places?

If it sees this as duplicated code then I foresee enough false positives to render it useless.

Re:Stupid question...

Ovid on 2009-02-02T22:43:01

It will think this is duplicated code in 3 packages. You would be running this on OO code, generally. Exporting functions into OO packages is a dodgy practice. It can override things you didn't want it to override and can show up as a "method" when it shouldn't. However, that's why I have the human-readable report and these are just "smells". After a while, people will learn to recognize these issues and ignore them.
I might even provide a .sniffrc format so that people can ignore Carp::croak or whatever else they put in there. Right now, I don't want to plan too much as I'd like to get some real-world feedback on what people experience with this. It might very well be useless.

Re:Stupid question...

btilly on 2009-02-03T00:18:51

Well as you say, lots of OO code still imports croak. Also there are people who do mixins. What about those? Sure, you can claim they are dodgy, but when they happen fairly often not by accident.
It seems to me that if you know that A::foo and B::bar have the same opcode, that it wouldn't be too hard to check whether \&A::foo == \&B::foo before reporting on the problem? For simple OO code this is would cut down the false positives. That eliminates the repeated croak issue fairly easily.
Unfortunately this won't cover the case where, for instance, someone does:
for my $field (@fields) {
no strict 'refs';
*{"$field"} = make_getter($field);
*{"set_$field"} = make_setter($field);
}
And now there are lots of methods whose opcodes are identical, but whose references are different. This will be a problem for people using ORMs, or a number of OO frameworks. That's a pretty big limitation. If you can't find any other way around it, though, you can just provide a way to list a set of examples of all sets of generated functions. So in this case you would list one getter and one setter, and it will assume that all functions with those opcodes are generated and should not be reported on.

Re:Stupid question...

Ovid on 2009-02-03T07:53:50

Ironically, lots of getters and setters are also a code smell in OO code, but trying to explain that one to programmers is like trying to nail pudding to a wall. As for mixins, traits and roles, yes, this will be problematic. I don't want special cases, but I think I can handle Class::Trait and Moose::Role, as I think those are the two most common ways of implementing this.
The \&A::foo == \&B::foo is important, but there's an interesting caveat. For a couple of the unreachable methods I discovered yesterday, we had functions exported into a namespace and due to Exporter::NoWork (great module. Wish I hadn't introduced it at work) were getting accidentally re-exported to other classes in a cascading behavior. Finding duplicate functions can help with this. I'll provide facilities for people to ignore certain duplicate functions, but for now, I still think it's useful.

Re: Exporter::NoWork

mauzo on 2009-02-03T14:46:20
Would an option to not re-export subs which were imported from somewhere else help, or are you relying on that behaviour?
I often wish there was a straightforward way to import subs into a lexical scope, so you can use unqualified function names without them showing up as methods. Moose offers no Moose; to do this, but it would be good to have something that hooked into UNITCHECK or some such which would allow
use lexically "Carp"; use lexically "Scalar::Util", qw/weaken/;

(Oh, and yet again I wish Perl 5 had a 'package name' type, so I didn't have to quote those...)

Re: Exporter::NoWork

btilly on 2009-02-03T16:44:54

The straightforward way you were not looking for is:
my $weaken = \&Scalar::Util::weaken; ...
$weaken->($ref);
It looks odd, but I've actually used this technique upon rare occasion and found it worked well.

Re: Exporter::NoWork

phaylon on 2009-02-05T14:20:06

I do this all the time instead:
package Foo; use Moose; # exports a lot implicitly use Scalar::Util qw( blessed ); # this removes everything imported above, but # expcludes the meta method installed by Moose use namespace::clean -except => 'meta';

Re: Exporter::NoWork

Ovid on 2009-02-05T14:26:51

I know you know this, but for others: namespace::clean is great. It removes those methods from the namespaces. Thus:
#!/usr/bin/env perl use strict; use warnings; { package Foo; use Moose; use Scalar::Util 'blessed'; use namespace::clean -except => 'meta'; } use Class::Sniff; my $sniff = Class::Sniff->new({class => 'Foo'}); print $sniff->to_string; print $sniff->report; __END__ +---------------+ | Moose::Object | +---------------+ ^ | | +---------------+ | Foo | +---------------+ Report for class: Foo Overridden Methods .--------+------------------------------------------------------------- -------. | Method | Class | +--------+------------------------------------------------------------------- -+ | meta | Foo | | | Moose::Object | '--------+------------------------------------------------------------------- -'

So 'blessed' won't show up, but meta is exported from Moose and is reported as "overridden" (which it is) and existing in both %Foo:: and %Moose::

Re:Stupid question...

btilly on 2009-02-03T16:56:03

Yes, I am well aware of the theories of OO design which say you don't want getters and setters. However I am also well aware that there are plenty of common approaches to common problems that use getters and setters and work well. Being the kind of person who prefers to worry about practicalities rather than someone's abstract theories, I feel no need to apologize about using getters and setters where it is appropriate.
An example I would offer is for using an object-relational mapper like Rose::DB to make it easy to fetch and retrieve data from a database without having to write SQL. If your design becomes more complicated, you might want to put wrappers on top of that to export cleaner APIs that hide the getters and setters. But having the actual database access mediated by an ORM is perfectly appropriate, and for what the ORM does, creating getters and setters is a very reasonable interface.

Re:Stupid question...

Ovid on 2009-02-03T17:55:49

It's a code smell, not a code error.

I'm not sick of it!

drhyde on 2009-02-03T14:55:15

Yeah, I know, you're sick of hearing about Class::Sniff

I'm not! It's good to hear about useful stuff that people are working on.