Damian Conway introduces us to the latest creation of his very busy brain, the NEXT class. For those who lament Perl's incomplete handling of multiple inheritance, this class should be the balm for what ails you. Those who would like more background on object-oriented programming in Perl may want to pick up a copy of Object Oriented Perl by Damian Conway, or even take a look at the perltoot manpage.
NEXT Big Thing
If you've used Perl's object-oriented features, you've probably come across the SUPER pseudo-class. It provides a way of calling an ancestral method from a derived method, without having to explicitly specify which ancestral method it is.
Huh???
Well, suppose we have a class with a dump_info method:
    package Person;

    sub new {
        my ($class, %args) = @_;
        bless { name => $args{name}, age => $args{age} }, $class;
    }

    sub dump_info {
        my ($self) = @_;
        print "Name:\t$self->{name}\n",
              "Age:\t$self->{age}\n";
    }
When we inherit from that class, our derived class might need to call the ancestral class's constructor (to set up the Person-al bits of the object) and perhaps it will dump extra information as well:
    package Soldier;
    use base 'Person';    # Soldier class inherits from Person class

    sub new {
        my ($class, %args) = @_;
        my $self = $class->Person::new(%args);    # Create object
        $self->{rank}   = $args{rank};            # Add extra info
        $self->{serial} = $args{serial};          # Add extra info
        return $self;
    }

    sub dump_info {
        my ($self) = @_;
        $self->Person::dump_info();               # Dump Person-al info
        print "Rank:\t$self->{rank}\n",           # Dump military info
              "S/Num:\t$self->{serial}\n";        # Dump military info
    }
Why do we have to explicitly tell the program that the inherited method should also be called (a technique known as "re-dispatch")? Because Perl won't automatically do it for us.
Normally, when you call a method on an object:
    $soldier->dump_info();
perl works out which class the object belongs to (e.g. Soldier) and then looks for a correspondingly named subroutine (i.e. &Soldier::dump_info) in the object's class. Since there is such a method in this case, it is immediately called.
However, if there had not been such a method defined, Perl would then have looked at the classes from which Soldier inherits, to see if any of them has a dump_info method that could be called instead. It starts with the first ancestral class (the left-most element in the @Soldier::ISA array), and checks if that class has a dump_info. If not, it tries that class's ancestors (and then that class's ancestors (and then THAT class's ancestors (and...you get the idea.)))
So the search through the object's inheritance tree proceeds left-most, depth-first. That is: at any point in the search, if you don't find a method in the current class, try the complete left ancestral tree first, then the complete ancestral tree to its right, etc. etc.
That process isn't as much work as it sounds because, as soon as the search finds a suitable method anywhere, it immediately ceases looking. That method is then invoked, after which (as far as perl is concerned) the method call is finished.
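To make that search order concrete, here is a minimal sketch. The hierarchy (including the Mammal and Respirant classes) is invented purely for illustration, and it assumes Perl 5.10 or later, where the core mro module can report the order in which perl searches a class's ancestors.

```perl
use strict;
use warnings;
use mro;    # core since Perl 5.10; 'dfs' (depth-first) is the default order

# Hypothetical hierarchy, for illustration only:
# Soldier inherits from Person and Respirant; Person inherits from Mammal.
{ package Mammal;    sub dump_info { return 'Mammal' } }
{ package Respirant; sub dump_info { return 'Respirant' } }
{ package Person;    our @ISA = ('Mammal'); }
{ package Soldier;   our @ISA = ('Person', 'Respirant'); }

# The classes perl searches, in order, for a method called on a Soldier.
sub search_order { return @{ mro::get_linear_isa('Soldier') } }

# The single dump_info that an ordinary method call ends up invoking.
sub first_dump_info { return Soldier->dump_info() }

print join(' -> ', search_order()), "\n";   # Soldier -> Person -> Mammal -> Respirant
print first_dump_info(), "\n";              # Mammal
```

Note that Respirant::dump_info is never reached: the search stops at the first match, which is found by exhausting Person's side of the tree first.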
And that's the problem.
With a debugging method like dump_info, we don't just want to call the first dump_info that we encounter; we want to call all of them. That way, we get all the information dumped, not just the most-derived information.
That's why we wrote Soldier::dump_info like this:
    sub dump_info {
        my ($self) = @_;
        $self->Person::dump_info();               # Re-dispatch
        print "Rank:\t$self->{rank}\n",
              "S/Num:\t$self->{serial}\n";
    }
Normally, once this (left-most, depth-first) method had been called, nothing else would happen. But we know that we need to invoke another method further up the hierarchy as well. So we explicitly "re-dispatch" ourselves upwards to find it.
Note that we had to hard-code the name of the ancestral class in Soldier::dump_info (and in the Soldier::new constructor as well). That's a Bad Idea, because if the name of that base class ever changed, or if we added an interim class between Person and Soldier, we'd have to remember to change that hard-coded ancestor name in every one of Soldier's methods that used it. And there could be dozens of them.
Of course, we could have taken advantage of the fact that the names of a Perl class's ancestors are available via its @ISA array. So we could have written:
    sub new {
        my ($class, %args) = @_;
        my $forebear = "$ISA[0]::new";            # Work out ancestor
        my $self = $class->$forebear(%args);      # Call it
        $self->{rank}   = $args{rank};
        $self->{serial} = $args{serial};
        return $self;
    }

    sub dump_info {
        my ($self) = @_;
        my $forebear = "$ISA[0]::dump_info";      # Work out ancestor
        $self->$forebear();                       # Call it
        print "Rank:\t$self->{rank}\n",
              "S/Num:\t$self->{serial}\n";
    }
Apart from being ugly, this technique is not very reliable. For example, suppose we later wanted Soldier to inherit from two or more classes at once? In that case, the $ISA[0] ancestor tree might not contain the ancestral method we want. It might be in $ISA[1]'s class hierarchy. Or $ISA[2]'s. And there's no way to know until we look.
Perl provides a solution to these problems in the form of a "pretend" class named SUPER. By writing the methods as:
    sub new {
        my ($class, %args) = @_;
        my $self = $class->SUPER::new(%args);
        $self->{rank}   = $args{rank};
        $self->{serial} = $args{serial};
        return $self;
    }

    sub dump_info {
        my ($self) = @_;
        $self->SUPER::dump_info();
        print "Rank:\t$self->{rank}\n",
              "S/Num:\t$self->{serial}\n";
    }
we tell perl to search through the current class's ancestor list (i.e. $ISA[0], then $ISA[1], then $ISA[2], etc.), find the left-most ancestral class that has the appropriate method, and call that method.
It's just the same as before, except we don't have to hard-code (or even soft-code) the names of the ancestor classes. If the @ISA list changes over time, the call through SUPER will simply search that new inheritance list. The call itself won't ever have to be rewritten.
It's all very handy.
Handy, but not perfect.
SUPER has two fatal weaknesses (no, not kryptonite and Lois Lane). There are two serious limitations in the way it searches for and calls inherited methods.
The first limitation is that it will only ever call a single ancestral method, even if two or more ancestors had (say) a dump_info method. Just as in a normal method call, it's always the method inherited from the left-most, depth-first ancestor that is selected. And only that method.
That's probably appropriate when we call SUPER::new, since we'd prefer that just one constructor be called. But it may be a genuine nuisance when we call SUPER::dump_info, since all of a Soldier's ancestral classes will probably have information that ought to be dumped.
Perhaps you're thinking that we should just add a second call to SUPER::dump_info inside Person::dump_info -- to re-re-dispatch the method call to yet another class.
That's certainly the right idea but, unfortunately, it brings us immediately to the second fatal flaw: SUPER only looks up the inheritance tree at any point. Calling it again in the Person class will restart the search for another dump_info, but only amongst Person's ancestors. It will never backtrack down the inheritance tree to try any other ancestors of the original Soldier class.
So all we can hope for is to call a sequence of "left-most" inherited dump_info methods, ignoring any other similar methods in any other branches of the inheritance tree.
That's only a partial solution, at best.
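Here's a small sketch of that limitation in action. The two-parent hierarchy is hypothetical, and each dump_info returns a list of labels instead of printing, simply so that the sequence of calls is easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical two-parent hierarchy: Soldier inherits from Person and Respirant.
{
    package Person;
    sub new       { my ($class) = @_; return bless {}, $class }
    sub dump_info { return ('Person') }
}
{
    package Respirant;
    sub dump_info { return ('Respirant') }
}
{
    package Soldier;
    our @ISA = ('Person', 'Respirant');
    sub dump_info {
        my ($self) = @_;
        # SUPER searches @Soldier::ISA left-most, depth-first and stops at
        # the first dump_info it finds, which here is Person's.
        return ($self->SUPER::dump_info(), 'Soldier');
    }
}

my @called = Soldier->new->dump_info();
print "@called\n";    # Person Soldier
```

Respirant::dump_info never runs. And adding a SUPER::dump_info call to Person::dump_info wouldn't help: Person has no ancestors in this sketch, so that call would simply throw an exception.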
Curiously, that same problem can crop up in a nearly unrelated context: the way an AUTOLOAD method handles failures.
Normally, when a class has an AUTOLOAD, that method is invoked if the class (and all its ancestors) don't have a suitable method. For example:
    package Soldier;

    sub AUTOLOAD {
        if ($AUTOLOAD =~ /::(march|salute|train)$/) {
            print "Sir, yes, sir!\n";
        }
        else {
            die "Unknown method called on Soldier: $AUTOLOAD";
        }
    }
This allows the class to intercept and handle calls to the undefined methods march, salute, and train, but still throw an exception when other undefined methods are called.
That can be particularly useful for prototyping, since we can use a single AUTOLOAD to act as a "stub" for dozens of new methods we haven't gotten around to actually implementing yet.
However, there's a problem here too. What if one of Soldier's ancestor classes had an AUTOLOAD that could handle the (undefined and unhandled) eat, sleep, and breathe methods? It would never get the chance to do so, because Soldier::AUTOLOAD would intercept the method call before it reached that ancestral AUTOLOAD.
To overcome that, people often write:
    sub AUTOLOAD {
        if ($AUTOLOAD =~ /::(march|salute|train)$/) {
            print "Sir, yes, sir\n";
        }
        else {
            shift->SUPER::AUTOLOAD(@_);
        }
    }
If the if can't handle the requested method, we let the else shift off the object reference and call an ancestral AUTOLOAD on it, passing the remaining arguments in @_. That gives the left-most ancestral AUTOLOAD a chance to deal with a missing method if the current AUTOLOAD can't.
But what if it's the right-most ancestral AUTOLOAD that can handle the missing method? It won't ever get a chance to do so, because the left-most AUTOLOAD will be invoked instead. And, even if that left-most AUTOLOAD does its own:
    shift->SUPER::AUTOLOAD(@_);
the chain of re-dispatches will only ever proceed upwards, never backtracking to give the right-most AUTOLOAD a chance.
What we need is the ability to re-dispatch a method (or an AUTOLOAD) in such a way that, rather than trying again with just the current class's ancestors, the re-dispatch restarts the original call (i.e. the one that got us to the current method).
By restarting the original call, we'd allow the search to backtrack down the inheritance tree if it needed to. And that would solve all our problems at once. For example, if the AUTOLOAD methods in Soldier's hierarchy were:
    package Person;

    sub AUTOLOAD {
        if ($AUTOLOAD =~ /::(eat|sleep)$/) {
            print "Sir, yes, sir\n";
        }
        else {
            # somehow restart original method search here
        }
    }

    package Respirant;

    sub AUTOLOAD {
        if ($AUTOLOAD =~ /::(breathe)$/) {
            print "Sir, yes, sir\n";
        }
        else {
            # somehow restart original method search here
        }
    }

    package Soldier;
    use base 'Person', 'Respirant';

    sub AUTOLOAD {
        if ($AUTOLOAD =~ /::(march|salute|train)$/) {
            print "Sir, yes, sir\n";
        }
        else {
            # somehow restart original method search here
        }
    }
then a request to breathe() would first find Soldier::AUTOLOAD, which would then restart the original search and find Person::AUTOLOAD, which would restart the original search again and backtrack to find Respirant::AUTOLOAD, which would finally handle the breathe() call.
Likewise if the various classes all had dump_info methods:
    sub Person::dump_info {
        my ($self) = @_;
        # somehow restart original method search here
        print "Name:\t$self->{name}\n",
              "Age:\t$self->{age}\n";
    }

    sub Respirant::dump_info {
        my ($self) = @_;
        # somehow restart original method search here
        print "L/Cap:\t$self->{lung_capacity}\n";
    }

    sub Soldier::dump_info {
        my ($self) = @_;
        # somehow restart original method search here
        print "Rank:\t$self->{rank}\n",
              "S/Num:\t$self->{serial}\n";
    }
then calling dump_info on a soldier object would eventually invoke each inherited dump_info as well (including any others that Person or Respirant might have inherited themselves).
The only question is: how can we restart an original method search? By the time we're in a method, that search is over.
That's where the NEXT pseudo-class comes in. NEXT is used just like SUPER:
    use NEXT;

    # and later (inside some method)...

    shift->NEXT::method_name(@_);
But, instead of beginning a new method look-up amongst the class's ancestors, it resumes the original method look-up, by-passing the existing method to find the next most appropriate one.
So, to solve the problem of finding the correct AUTOLOAD in a hierarchy, we simply ensure that each AUTOLOAD re-dispatches via NEXT if it can't handle the call itself:
    sub AUTOLOAD {
        if ($AUTOLOAD =~ /::(march|salute|train)$/) {
            print "Sir, yes, sir\n";
        }
        else {
            shift->NEXT::AUTOLOAD(@_);
        }
    }
The same approach solves the problem of ensuring that every ancestral dump_info is properly invoked. If each method in each class is structured as:
    sub dump_info {
        my ($self) = @_;
        $self->NEXT::dump_info();
        print "This Class's Info:\t$self->{this_class_info}\n";
    }
then the chain of restarts will ensure that every dump_info throughout the entire tree is called (with ancestors being called before descendants). Alternatively, we could write:
    sub dump_info {
        my ($self) = @_;
        print "This Class's Info:\t$self->{this_class_info}\n";
        $self->NEXT::dump_info();
    }
and have descendant class information dumped before its ancestors.
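As a runnable sketch of the first arrangement, here's the Person/Respirant/Soldier hierarchy with each dump_info re-dispatching through NEXT before adding its own line. Returning lists instead of printing, and the particular attribute values, are assumptions made only to keep the example easy to check.

```perl
use strict;
use warnings;
use NEXT;

{
    package Person;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub dump_info {
        my ($self) = @_;
        # Resume the original search first, then add this class's line.
        return ($self->NEXT::dump_info(), "Name: $self->{name}");
    }
}
{
    package Respirant;
    sub dump_info {
        my ($self) = @_;
        return ($self->NEXT::dump_info(), "L/Cap: $self->{lung_capacity}");
    }
}
{
    package Soldier;
    our @ISA = ('Person', 'Respirant');
    sub dump_info {
        my ($self) = @_;
        return ($self->NEXT::dump_info(), "Rank: $self->{rank}");
    }
}

my $soldier = Soldier->new(name => 'Smith', rank => 'Private', lung_capacity => '6L');
my @info = $soldier->dump_info();
print "$_\n" for @info;
```

Because every method re-dispatches before contributing its own line, the deepest re-dispatch reports first: Respirant's line, then Person's, then Soldier's. The final NEXT::dump_info call inside Respirant finds nothing further and contributes an empty list.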
Probably the single most useful application for this ability is to ensure that all of an object's destructors are called appropriately:
    package Person;

    sub DESTROY {
        # Clean up Personalized bits of object
        # and then...
        shift->NEXT::DESTROY();
    }

    package Respirant;

    sub DESTROY {
        # Clean up Respiratory bits of object
        # and then...
        shift->NEXT::DESTROY();
    }

    package Soldier;
    use base 'Person', 'Respirant';

    sub DESTROY {
        # Clean up Military bits of object
        # and then...
        shift->NEXT::DESTROY();
    }
This ensures that all three destructors available to a Soldier object will be called when the object ceases to exist.
Without that re-dispatching, only Soldier::DESTROY would be invoked, leaving the ancestral bits of the object undestructed. And that could be a serious problem.
Explicit destructors are rarely needed in Perl; garbage collection takes care of most things automagically. So if someone went to the trouble of giving a class a destructor, it's almost certainly very important that the destructor actually be called. Standard Perl doesn't guarantee that for inherited destructors: destructors are regular methods, so only the left-most, depth-first one is ever called.
But, by chaining the destructors together using NEXT, we ensure that every destructor that should be invoked is invoked.
There's still one problem with having every method re-dispatch itself that way: eventually we'll have invoked every available method in every ancestral class. So when the very last method in the very last class also re-dispatches itself, there will be no "next" method to find. And quicker than you can say:
    Can't locate object method "dump_info" via package "Soldier"
we'll have thrown an exception.
Except of course, we won't. Unlike the SUPER pseudo-class, which does die horribly in this way when we run out of ancestral methods to call, NEXT doesn't throw an exception if it fails to find a next method to call. Because it's typically used to traverse an entire hierarchy, once it's completed that traversal, it simply stops looking and quietly returns.
That means we can happily add a:
    $self->NEXT::dump_info();
at the end of every dump_info method, just in case there might be another dump_info in some other class somewhere. If there is, it will be called; if there isn't, the re-dispatch does nothing.
Which is fine...until we want to use NEXT to re-dispatch AUTOLOAD.
Re-dispatching AUTOLOAD is a little different. We usually do so because the current AUTOLOAD can't handle the original call. In which case, rather than looking for extra ways of handling a particular call, we're looking for the One True Way to handle it.
What if we don't find it?
Well, because NEXT fails quietly, a call like:
    $soldier->entrechat();    # Army Corps de Ballet???
will work its way through the various ancestral AUTOLOADs, fail to find any that can handle this particular manoeuvre, and quietly do nothing. Instead of throwing an exception, like it should.
A fellow Aussie, Paul Fenwick (who maintains the Finance::Quote module, when he's not busy running Perl Training Australia), first pointed this out to me. He also suggested a simple solution that I adopted in the most recent release of the NEXT module.
Paul's idea was to provide a second pseudo-class (which I named NEXT::ACTUAL) that acts exactly like NEXT, except that it throws an exception on failure. So now, if we write:
    $self->NEXT::ACTUAL::AUTOLOAD();
and there isn't an actual next AUTOLOAD to call, we get an exception instead.
This new pseudo-class is probably only useful in AUTOLOADs, but it can be used to "strictly" re-dispatch any method.
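Putting the pieces together, here's a sketch of the three-AUTOLOAD hierarchy re-dispatching strictly through NEXT::ACTUAL. Returning strings instead of printing, and the empty Person::DESTROY (there so that object destruction never wanders into AUTOLOAD), are additions made for this illustration.

```perl
use strict;
use warnings;
use NEXT;

{
    package Person;
    our $AUTOLOAD;
    sub new     { my ($class) = @_; return bless {}, $class }
    sub DESTROY { }    # defined so object destruction never reaches AUTOLOAD
    sub AUTOLOAD {
        return "Person handles $1" if $AUTOLOAD =~ /::(eat|sleep)$/;
        return shift->NEXT::ACTUAL::AUTOLOAD(@_);    # strict re-dispatch
    }
}
{
    package Respirant;
    our $AUTOLOAD;
    sub AUTOLOAD {
        return "Respirant handles $1" if $AUTOLOAD =~ /::(breathe)$/;
        return shift->NEXT::ACTUAL::AUTOLOAD(@_);
    }
}
{
    package Soldier;
    our @ISA = ('Person', 'Respirant');
    our $AUTOLOAD;
    sub AUTOLOAD {
        return "Soldier handles $1" if $AUTOLOAD =~ /::(march|salute|train)$/;
        return shift->NEXT::ACTUAL::AUTOLOAD(@_);
    }
}

my $soldier = Soldier->new;
print $soldier->breathe(), "\n";    # handled by the right-most ancestor, Respirant

# No AUTOLOAD in the hierarchy recognises this, so NEXT::ACTUAL throws.
my $ok = eval { $soldier->entrechat(); 1 };
print "entrechat failed: $@" unless $ok;
```

Swapping NEXT::ACTUAL for plain NEXT in the three AUTOLOADs would make the entrechat call return quietly instead.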
Paul also pointed out another situation in which NEXT's normal behaviour may not be desirable. By default, a series of NEXT invocations walks an object's entire inheritance hierarchy and calls every appropriately named method it encounters.
That can be a problem in a so-called "diamond inheritance" hierarchy, in which a derived class inherits a single ancestor class through two or more paths:
              Person
             ^^    ^
            /  \    \
      Worker    \    Thinker
        ^        \     ^
       /          \   /
  Soldier         Leader
        ^          ^
         \        /
         Commander
Under this kind of arrangement, a method defined in an "apex" class (like Person) can be called twice or more -- once each time NEXT's hierarchy traversal visits it.
Occasionally that might be appropriate, but often it's not. For example, we might be NEXTing through the destructors inherited by an object. Those methods will very probably be freeing up resources the object had acquired during its lifetime. If Person has such a destructor:
    sub Person::DESTROY {
        my ($self) = @_;
        seek $self->{changelog}, 0, 0;
        truncate $self->{changelog}, $self->{maxloglen};
        close $self->{changelog};
        $self->NEXT::DESTROY();
    }
then a Commander object would call that destructor three times (once through Worker, once through Leader, and once through Thinker). That would cause Very Bad Things to happen when the second call to Person::DESTROY attempted to manipulate the $self->{changelog} filehandle -- after the first call to Person::DESTROY had already closed it.
So NEXT provides yet another pseudo-class -- NEXT::UNSEEN -- which re-dispatches to the next appropriate method that hasn't already been re-dispatched to. So, if the Commander hierarchy used:
    $self->NEXT::UNSEEN::DESTROY();
throughout, then each inherited destructor would be called only once. Person::DESTROY would be called the first time (through Worker), but the second and third time it's encountered (through Leader and then Thinker) it will be skipped, since it has already been seen.
Oh, and yes, we can have both the "one ping only" behaviour of NEXT::UNSEEN and the "exception-on-failure" of NEXT::ACTUAL at the same time. Just use:
    $self->NEXT::UNSEEN::ACTUAL::method_name();
or:
    $self->NEXT::ACTUAL::UNSEEN::method_name();
The NEXT module is now very stable and quite usable in production code. It will ship as part of the core distribution of perl 5.8.
Only two further enhancements are currently planned. The first is to provide a mechanism to allow an object's inheritance hierarchy to be traversed breadth-first (and maybe in other sequences too), rather than depth-first.
Breadth-first traversal is especially important for re-dispatching destructors, as it ensures that base-class destructors are not invoked until all the derived bits of an object have been properly cleaned up. That's vital because the derived bits may well be relying on the state of the base bits in some way.
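For a tidy hierarchy, a breadth-first ordering is easy enough to sketch with a queue. The helper below, breadth_first_ancestors, is hypothetical and is not part of NEXT; it simply walks each class's @ISA level by level, visiting each class once, using a cut-down four-class diamond as its test case.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of NEXT): list a class and its ancestors
# level by level, visiting each class only once.
sub breadth_first_ancestors {
    my ($class) = @_;
    my @queue = ($class);
    my (%seen, @order);
    while (defined(my $current = shift @queue)) {
        next if $seen{$current}++;           # skip classes already listed
        push @order, $current;
        no strict 'refs';                    # @ISA is looked up by package name
        push @queue, @{"${current}::ISA"};   # parents join the back of the queue
    }
    return @order;
}

# Cut-down diamond, for illustration only.
{ package Person; }
{ package Worker;    our @ISA = ('Person'); }
{ package Thinker;   our @ISA = ('Person'); }
{ package Commander; our @ISA = ('Worker', 'Thinker'); }

print join(' ', breadth_first_ancestors('Commander')), "\n";
# Commander Worker Thinker Person
```

Here every derived class is listed before its base. A simple queue like this gives no such guarantee in less regular hierarchies, where a class can be reached at different depths along different paths.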
Working out the correct breadth-first sequence is non-trivial in the general case. For example, consider the subtleties of determining the correct "breadth-first" order of the following (pathological, but legal) inheritance hierarchy:
            Person
           ^  ^  ^
          /   |   \
     Worker---|--->Thinker
       ^   \  |     ^
      /     \ |    /
     /       v|   /
  Soldier-->Leader
       ^     ^
        \   /
       Commander
The second enhancement will be to integrate the NEXT module with the Class::Multimethods module and its forthcoming successor, Attributes::Multimethod. This will make it possible to re-dispatch multiply dispatched methods as well.
With those two additions Perl will then have one of the most flexible and powerful dispatch mechanisms of any programming language.
Just as it should.