Redispatching Method Calls with NEXT

jjohn on 2001-12-07T05:12:30

Damian Conway introduces us to the latest creation of his very busy brain, the NEXT class. For those who lament Perl's incomplete handling of multiple inheritance, this class should be the balm for what ails you. Those who would like more background on object-oriented programming in Perl may want to pick up a copy of Object Oriented Perl by Damian Conway, or take a look at the perltoot manpage.

The NEXT Big Thing

If you've used Perl's object-oriented features, you've probably come across the SUPER pseudo-class. It provides a way of calling an ancestral method from a derived method, without having to explicitly specify which ancestral method it is.

Huh???

Well, suppose we have a class with a dump_info method:

	package Person;
	sub new {
		my ($class, %args) = @_;
		bless { name => $args{name}, age => $args{age} }, $class;
	}
	sub dump_info {
		my ($self) = @_;
		print "Name:\t$self->{name}\n",
		      "Age:\t$self->{age}\n";
	}

When we inherit from that class, our derived class might need to call the ancestral class's constructor (to set up the Person-al bits of the object) and perhaps it will dump extra information as well:

        package Soldier;
        use base 'Person';	# Soldier class inherits from Person class
        sub new {
                my ($class, %args) = @_;
                my $self = $class->Person::new(%args);	# Create object
                $self->{rank}   = $args{rank};		# Add extra info
                $self->{serial} = $args{serial};	# Add extra info
                return $self;
        }
        sub dump_info {
                my ($self) = @_;
                $self->Person::dump_info();		# Dump Person-al info
                print "Rank:\t$self->{rank}\n",		# Dump military info
                      "S/Num:\t$self->{serial}\n";	# Dump military info
        }

Why do we have to explicitly tell the program that the inherited method should also be called (a technique known as "re-dispatch")? Because Perl won't automatically do it for us.

What you see (first) is what you get (only)

Normally, when you call a method on an object:

	$soldier->dump_info();

perl works out which class the object belongs to (e.g. Soldier) and then looks for a correspondingly named subroutine (i.e. &Soldier::dump_info) in the object's class. Since there is such a method in this case, it is immediately called.

However, if there had not been such a method defined, Perl would then have looked at the classes from which Soldier inherits, to see if any of them has a dump_info method that could be called instead. It starts with the first ancestral class (the left-most element in the @Soldier::ISA array), and checks if that class has a dump_info. If not, it tries that class's ancestors (and then that class's ancestors (and then THAT class's ancestors (and...you get the idea.)))

So the search through the object's inheritance tree proceeds left-most, depth-first. That is: at any point in the search, if you don't find a method in the current class, try the complete left ancestral tree first, then the complete ancestral tree to its right, etc. etc.

That process isn't as much work as it sounds because, as soon as the search finds a suitable method anywhere, it immediately ceases looking. That method is then invoked, after which (as far as perl is concerned) the method call is finished.
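As a minimal sketch of that search order (the Organism and Respirant classes and the describe method are hypothetical names, introduced here only for illustration), consider a Soldier with two parents, where only a grandparent on the left and the parent on the right define the method:

```perl
use strict;
use warnings;

# Hypothetical hierarchy, for illustration only:
#   Soldier inherits from Person (left) and Respirant (right);
#   Person inherits from Organism.
package Organism;
sub describe { return 'Organism::describe' }

package Person;
our @ISA = ('Organism');

package Respirant;
sub describe { return 'Respirant::describe' }

package Soldier;
our @ISA = ('Person', 'Respirant');

package main;

# Neither Soldier nor Person defines describe(), so the search climbs
# Person's whole ancestral tree before it ever considers Respirant.
my $found = Soldier->describe();
print "$found\n";
```

The call resolves to Organism::describe, even though Respirant is a closer relative of Soldier. Respirant::describe is never considered, because the search stopped as soon as it found a match in the left-hand tree.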

And that's the problem.

Please, Sir, I want some more.

With a debugging method like dump_info, we don't just want to call the first dump_info that we encounter; we want to call all of them. That way, we get all the information dumped, not just the most-derived information.

That's why we wrote Soldier::dump_info like this:

        sub dump_info {
                my ($self) = @_;
                $self->Person::dump_info();		# Re-dispatch
                print "Rank:\t$self->{rank}\n",
                      "S/Num:\t$self->{serial}\n";
        }

Normally, once this (left-most, depth-first) method had been called, nothing else would happen. But we know that we need to invoke another method further up the hierarchy as well. So we explicitly "re-dispatch" ourselves upwards to find it.

Note that we had to hard-code the name of the ancestral class in Soldier::dump_info (and in the Soldier::new constructor as well). That's a Bad Idea, because if the name of that base class ever changed, or if we added an interim class between Person and Soldier, we'd have to remember to change that hard-coded ancestor name in every one of Soldier's methods that used it. And there could be dozens of them.

Of course, we could have taken advantage of the fact that the names of a Perl class's ancestors are available via its @ISA array. So we could have written:

        sub new {
                my ($class, %args) = @_;
                my $forebear = "$ISA[0]::new";		# Work out ancestor
                my $self = $class->$forebear(%args);	# Call it
                $self->{rank}   = $args{rank};
                $self->{serial} = $args{serial};
                return $self;
        }
        sub dump_info {
                my ($self) = @_;
                my $forebear = "$ISA[0]::dump_info";	# Work out ancestor
                $self->$forebear();			# Call it
                print "Rank:\t$self->{rank}\n",
                      "S/Num:\t$self->{serial}\n";
        }

Apart from being ugly, this technique is not very reliable. For example, suppose we later wanted Soldier to inherit from two or more classes at once? In that case, the $ISA[0] ancestor tree might not contain the ancestral method we want. It might be in $ISA[1]'s class hierarchy. Or $ISA[2]'s. And there's no way to know until we look.
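Here is a small sketch of that failure, assuming a hypothetical second parent class Respirant that is the only ancestor with a dump_info method. The class layout is an illustration, not something from the classes shown so far:

```perl
use strict;
use warnings;

package Person;
sub new { my ($class) = @_; return bless {}, $class }

# Hypothetical second parent: the only ancestor with a dump_info.
package Respirant;
sub dump_info { return 'Respirant::dump_info' }

package Soldier;
our @ISA = ('Person', 'Respirant');
sub dump_info {
    my ($self) = @_;
    my $forebear = "$ISA[0]::dump_info";    # Always "Person::dump_info"
    return $self->$forebear();              # Searches Person's tree only
}

package main;

my $soldier = Soldier->new();
my $ok      = eval { $soldier->dump_info(); 1 };
my $error   = $ok ? '' : $@;
print "Re-dispatch failed: $error";
```

The re-dispatch dies with a "Can't locate object method" error, because the hard-wired $ISA[0] only ever looks in Person's side of the family, while the method we wanted lives under $ISA[1].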

Perl provides a solution to these problems in the form of a "pretend" class named SUPER. By writing the methods as:

        sub new {
                my ($class, %args) = @_;
                my $self = $class->SUPER::new(%args);
                $self->{rank}   = $args{rank};
                $self->{serial} = $args{serial};
                return $self;
        }
        sub dump_info {
                my ($self) = @_;
                $self->SUPER::dump_info();
                print "Rank:\t$self->{rank}\n",
                      "S/Num:\t$self->{serial}\n";
        }

we tell perl to search through the current class's ancestor list (i.e. $ISA[0], then $ISA[1], then $ISA[2], etc.), find the left-most ancestral class that has the appropriate method, and call that method.

It's just the same as before, except we don't have to hard-code (or even soft-code) the names of the ancestor classes. If the @ISA list changes over time, the call through SUPER will simply search that new inheritance list. The call itself won't ever have to be rewritten.

It's all very handy.

Not so super

Handy, but not perfect.

SUPER has two fatal weaknesses (no, not kryptonite and Lois Lane). There are two serious limitations in the way it searches for and calls inherited methods.

The first limitation is that it will only ever call a single ancestral method, even if two or more ancestors had (say) a dump_info method. Just as in a normal method call, it's always the method inherited from the left-most, depth-first ancestor that is selected. And only that method.

That's probably appropriate when we call SUPER::new, since we'd prefer that just one constructor be called. But it may be a genuine nuisance when we call SUPER::dump_info, since all of a Soldier's ancestral classes will probably have information that ought to be dumped.

Perhaps you're thinking that we should just add a second call to SUPER::dump_info inside Person::dump_info -- to re-re-dispatch the method call to yet another class.

That's certainly the right idea but, unfortunately, it brings us immediately to the second fatal flaw: SUPER only looks up the inheritance tree at any point. Calling it again in the Person class will restart the search for another dump_info, but only amongst Person's ancestors. It will never backtrack down the inheritance tree to try any other ancestors of the original Soldier class.

So all we can hope for is to call a sequence of "left-most" inherited dump_info methods, ignoring any other similar methods in any other branches of the inheritance tree. That's only a partial solution, at best.
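A short sketch makes the gap visible. It assumes the hypothetical Respirant parent used for illustration elsewhere in this article, and it records calls in an array instead of printing fields:

```perl
use strict;
use warnings;

my @called;    # Records which dump_info methods actually ran

package Person;
sub new       { my ($class) = @_; return bless {}, $class }
sub dump_info { push @called, 'Person' }

package Respirant;
sub dump_info { push @called, 'Respirant' }

package Soldier;
our @ISA = ('Person', 'Respirant');
sub dump_info {
    my ($self) = @_;
    $self->SUPER::dump_info();    # Finds Person::dump_info, then stops
    push @called, 'Soldier';
}

package main;

my $soldier = Soldier->new();
$soldier->dump_info();
print "Called: @called\n";
```

Only Person and Soldier appear in the list. Respirant::dump_info is never reached, and adding a SUPER call inside Person::dump_info would not help, since Respirant is not one of Person's ancestors.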

Meanwhile, on Planet AUTOLOAD...

Curiously, that same problem can crop up in a nearly unrelated context: the way an AUTOLOAD method handles failures.

Normally, when a class has an AUTOLOAD, that method is invoked if the class (and all its ancestors) don't have a suitable method. For example:

        package Soldier;
        sub AUTOLOAD {
                if ($AUTOLOAD =~ /::(march|salute|train)$/) {
                        print "Sir, yes, sir!\n";
                        return;		# Handled, so don't fall through to die
                }
                die "Unknown method called on Soldier: $AUTOLOAD";
        }

This allows the class to intercept and handle calls to the undefined methods march, salute, and train, but still throw an exception when other undefined methods are called.

That can be particularly useful for prototyping, since we can use a single AUTOLOAD to act as a "stub" for dozens of new methods we haven't gotten around to actually implementing yet.

However, there's a problem here too. What if one of Soldier's ancestor classes had an AUTOLOAD that could handle the (undefined and unhandled) eat, sleep, and breathe methods? It would never get the chance to do so, because Soldier::AUTOLOAD would intercept the method call before it reached that ancestral AUTOLOAD.

To overcome that, people often write:

        sub AUTOLOAD {
                if ($AUTOLOAD =~ /::(march|salute|train)$/) {
                        print "Sir, yes, sir!\n";
                }
                else {
                        shift->SUPER::AUTOLOAD(@_);
                }
        }

If the if can't handle the requested method, we let the else shift off the object reference and call an ancestral AUTOLOAD on it, passing the remaining arguments in @_. That gives the left-most ancestral AUTOLOAD a chance to deal with a missing method if the current AUTOLOAD can't.

But what if it's the right-most ancestral AUTOLOAD that can handle the missing method? It won't ever get a chance to do so, because the left-most AUTOLOAD will be invoked instead. And, even if that left-most AUTOLOAD does its own:

        shift->SUPER::AUTOLOAD(@_);

the chain of re-dispatches will only ever proceed upwards, never backtracking to give the right-most AUTOLOAD a chance.

Next train stops all stations

What we need is the ability to re-dispatch a method (or an AUTOLOAD) in such a way that, rather than trying again with just the current class's ancestors, the re-dispatch restarts the original call (i.e. the one that got us to the current method).

By restarting the original call, we'd allow the search to backtrack down the inheritance tree if it needed to. And that would solve all our problems at once. For example, if the AUTOLOAD methods in Soldier's hierarchy were:

        package Person;
        sub AUTOLOAD {
                if ($AUTOLOAD =~ /::(eat|sleep)$/) {
                        print "Sir, yes, sir!\n";
                }
                else {
                        # somehow restart original method search here
                }
        }
        package Respirant;
        sub AUTOLOAD {
                if ($AUTOLOAD =~ /::(breathe)$/) {
                        print "Sir, yes, sir!\n";
                }
                else {
                        # somehow restart original method search here
                }
        }
        package Soldier;
        use base 'Person', 'Respirant';
        sub AUTOLOAD {
                if ($AUTOLOAD =~ /::(march|salute|train)$/) {
                        print "Sir, yes, sir!\n";
                }
                else {
                        # somehow restart original method search here
                }
        }

then a request to breathe() would first find Soldier::AUTOLOAD, which would then restart the original search and find Person::AUTOLOAD, which would restart the original search again and backtrack to find Respirant::AUTOLOAD, which would finally handle the breathe() call.

Likewise if the various classes all had dump_info methods:

        sub Person::dump_info {
                my ($self) = @_;
                # somehow restart original method search here
                print "Name:\t$self->{name}\n",
                      "Age:\t$self->{age}\n";
        }
        sub Respirant::dump_info {
                my ($self) = @_;
                # somehow restart original method search here
                print "L/Cap:\t$self->{lung_capacity}\n";
        }
        sub Soldier::dump_info {
                my ($self) = @_;
                # somehow restart original method search here
                print "Rank:\t$self->{rank}\n",
                      "S/Num:\t$self->{serial}\n";
        }

then calling dump_info on a soldier object would eventually invoke each inherited dump_info as well (including any others that Person or Respirant might have inherited themselves).

The only question is: how can we restart an original method search? By the time we're in a method, that search is over.

Better luck next time.

That's where the NEXT pseudo-class comes in. NEXT is used just like SUPER:

        use NEXT;
        # and later (inside some method)...
        shift->NEXT::method_name(@_);

But, instead of beginning a new method look-up amongst the class's ancestors, it resumes the original method look-up, by-passing the existing method to find the next most appropriate one.

So, to solve the problem of finding the correct AUTOLOAD in a hierarchy, we simply ensure that each AUTOLOAD re-dispatches via NEXT if it can't handle the call itself:

        sub AUTOLOAD {
                if ($AUTOLOAD =~ /::(march|salute|train)$/) {
                        print "Sir, yes, sir!\n";
                }
                else {
                        shift->NEXT::AUTOLOAD(@_);
                }
        }
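Putting the three AUTOLOADs of the Soldier hierarchy together gives something like the following sketch. To keep it self-contained it records who handled each call in an array (a hypothetical @handled, not part of the classes above) and invokes the methods on the class name, so that no object destruction is routed through AUTOLOAD:

```perl
use strict;
use warnings;
use NEXT;

my @handled;    # Hypothetical log of which class handled which call

package Person;
our $AUTOLOAD;
sub AUTOLOAD {
    if ($AUTOLOAD =~ /::(eat|sleep)$/) {
        push @handled, "Person handles $1";
    }
    else {
        shift->NEXT::AUTOLOAD(@_);
    }
}

package Respirant;
our $AUTOLOAD;
sub AUTOLOAD {
    if ($AUTOLOAD =~ /::(breathe)$/) {
        push @handled, "Respirant handles $1";
    }
    else {
        shift->NEXT::AUTOLOAD(@_);
    }
}

package Soldier;
our @ISA = ('Person', 'Respirant');
our $AUTOLOAD;
sub AUTOLOAD {
    if ($AUTOLOAD =~ /::(march|salute|train)$/) {
        push @handled, "Soldier handles $1";
    }
    else {
        shift->NEXT::AUTOLOAD(@_);
    }
}

package main;

Soldier->march();      # Handled immediately by Soldier::AUTOLOAD
Soldier->eat();        # Passed on to Person::AUTOLOAD
Soldier->breathe();    # Passed through Person to Respirant::AUTOLOAD
print "$_\n" for @handled;
```

The breathe call is the interesting one: it travels from Soldier up to Person and then sideways to Respirant, which a chain of SUPER calls could not reach.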

The same approach solves the problem of ensuring that every ancestral dump_info is properly invoked. If each method in each class is structured as:

        sub dump_info {
                my ($self) = @_;
                $self->NEXT::dump_info();
                print "This Class's Info:\t$self->{this_class_info}\n";
        }

then the chain of restarts will ensure that every dump_info throughout the entire tree is called (with ancestors being called before descendants). Alternatively, we could write:

        sub dump_info {
                my ($self) = @_;
                print "This Class's Info:\t$self->{this_class_info}\n";
                $self->NEXT::dump_info();
        }

and have descendant class information dumped before its ancestors.
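To see the first ordering in action, here is a sketch using the Person, Respirant and Soldier classes, with each dump_info re-dispatching before it reports. The @order array is a hypothetical stand-in for the printed output:

```perl
use strict;
use warnings;
use NEXT;

my @order;    # Hypothetical record of the order in which classes report

package Person;
sub new { my ($class) = @_; return bless {}, $class }
sub dump_info {
    my ($self) = @_;
    $self->NEXT::dump_info();    # Re-dispatch first...
    push @order, 'Person';       # ...then report
}

package Respirant;
sub dump_info {
    my ($self) = @_;
    $self->NEXT::dump_info();
    push @order, 'Respirant';
}

package Soldier;
our @ISA = ('Person', 'Respirant');
sub dump_info {
    my ($self) = @_;
    $self->NEXT::dump_info();
    push @order, 'Soldier';
}

package main;

my $soldier = Soldier->new();
$soldier->dump_info();
print "Order: @order\n";
```

The classes report as Respirant, Person, Soldier: the resumed search visits Soldier, Person, Respirant in turn, and each method reports only after the ones beyond it have returned. Moving the NEXT call to the end of each method reverses that order.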

Probably the single most useful application for this ability is to ensure that all an object's destructors are called appropriately:

        package Person;
        sub DESTROY {
                # Clean up Personalized bits of object
                # and then...
                shift->NEXT::DESTROY();
        }
        package Respirant;
        sub DESTROY {
                # Clean up Respiratory bits of object
                # and then...
                shift->NEXT::DESTROY();
        }
        package Soldier;
        use base 'Person', 'Respirant';
        sub DESTROY {
                # Clean up Military bits of object
                # and then...
                shift->NEXT::DESTROY();
        }

This ensures that all three destructors available to a Soldier object will be called when the object ceases to exist.

Without that re-dispatching, only Soldier::DESTROY would be invoked, leaving the ancestral bits of the object undestructed. And that could be a serious problem.

Explicit destructors are rarely needed in Perl; garbage collection takes care of most things automagically. So if someone went to the trouble of giving a class a destructor, it's almost certainly very important that the destructor actually be called. Standard Perl doesn't guarantee that: destructors are regular methods, so only the left-most, depth-first DESTROY is ever called.

But, by chaining the destructors together using NEXT, we ensure that every destructor that should be invoked is invoked.
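As a rough, runnable version of the destructor chain above, the following sketch logs each destructor into a hypothetical @destroyed array in place of real clean-up code:

```perl
use strict;
use warnings;
use NEXT;

my @destroyed;    # Hypothetical log standing in for real clean-up work

package Person;
sub new { my ($class) = @_; return bless {}, $class }
sub DESTROY {
    my ($self) = @_;
    push @destroyed, 'Person';
    $self->NEXT::DESTROY();
}

package Respirant;
sub DESTROY {
    my ($self) = @_;
    push @destroyed, 'Respirant';
    $self->NEXT::DESTROY();
}

package Soldier;
our @ISA = ('Person', 'Respirant');
sub DESTROY {
    my ($self) = @_;
    push @destroyed, 'Soldier';
    $self->NEXT::DESTROY();
}

package main;

{
    my $soldier = Soldier->new();
}   # $soldier goes out of scope here, so Perl calls Soldier::DESTROY

print "Destroyed: @destroyed\n";
```

Perl itself only calls Soldier::DESTROY; the other two entries in the log come from the NEXT re-dispatches.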

Next to nothing

There's still one problem with having every method re-dispatch itself that way: eventually we'll have invoked every available method in every ancestral class. So when the very last method in the very last class also re-dispatches itself, there will be no "next" method to find. And quicker than you can say:

        Can't locate object method "dump_info" via package "Soldier"

we'll have thrown an exception.

Except of course, we won't. Unlike the SUPER pseudo-class, which does die horribly in this way when we run out of ancestral methods to call, NEXT doesn't throw an exception if it fails to find a next method to call. Because it's typically used to traverse an entire hierarchy, once it's completed that traversal, it simply stops looking and quietly returns.

That means we can happily add a:

        $self->NEXT::dump_info();

at the end of every dump_info method, just in case there might be another dump_info in some other class somewhere. If there is, it will be called; if there isn't, the re-dispatch does nothing.
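The difference from SUPER is easy to check with a class that has no ancestors at all. In this sketch the method names report_super and report_next are hypothetical, chosen only to tell the two cases apart:

```perl
use strict;
use warnings;
use NEXT;

package Person;
sub new { my ($class) = @_; return bless {}, $class }

# Hypothetical method that re-dispatches via SUPER with nowhere to go
sub report_super {
    my ($self) = @_;
    $self->SUPER::report_super();
    return 'finished';
}

# Hypothetical method that re-dispatches via NEXT with nowhere to go
sub report_next {
    my ($self) = @_;
    $self->NEXT::report_next();
    return 'finished';
}

package main;

my $person = Person->new();

my $super_result = eval { $person->report_super() };
my $super_error  = $@;

my $next_result  = eval { $person->report_next() };
my $next_error   = $@;

print "SUPER: ", (defined $super_result ? $super_result : "died: $super_error");
print "NEXT:  ", (defined $next_result  ? $next_result  : "died: $next_error"), "\n";
```

The SUPER version dies with a "Can't locate object method" error, while the NEXT version returns normally.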

Which is fine...until we want to use NEXT to re-dispatch AUTOLOAD.

Next to impossible

Re-dispatching AUTOLOAD is a little different. We usually do so because the current AUTOLOAD can't handle the original call. In which case, rather than looking for extra ways of handling a particular call, we're looking for the One True Way to handle it.

What if we don't find it?

Well, because NEXT fails quietly, a call like:

        $soldier->entrechat();		# Army Corps de Ballet???

will work its way through the various ancestral AUTOLOADs, fail to find any that can handle this particular manoeuvre, and quietly do nothing. Instead of throwing an exception, like it should.

A fellow Aussie, Paul Fenwick (who maintains the Finance::Quote module, when he's not busy running Perl Training Australia), first pointed this out to me. He also suggested a simple solution that I adopted in the most recent release of the NEXT module.

Paul's idea was to provide a second pseudo-class (which I named NEXT::ACTUAL) that acts exactly like NEXT, except that it throws an exception on failure. So now, if we write:

        $self->NEXT::ACTUAL::AUTOLOAD();

and there isn't an actual next AUTOLOAD to call, we get an exception instead.

This new pseudo-class is probably only useful in AUTOLOADs, but it can be used to "strictly" re-dispatch any method.
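Here is a sketch of that strict re-dispatch applied to an ordinary method. The handle method and its string arguments are hypothetical, standing in for the AUTOLOAD logic used earlier: each class either deals with the order or passes it on via NEXT::ACTUAL:

```perl
use strict;
use warnings;
use NEXT;

package Person;
sub handle {
    my ($self, $order) = @_;
    return "Person handles $order" if $order =~ /^(?:eat|sleep)$/;
    # Nobody is left to ask, so this strict re-dispatch throws
    return $self->NEXT::ACTUAL::handle($order);
}

package Soldier;
our @ISA = ('Person');
sub handle {
    my ($self, $order) = @_;
    return "Soldier handles $order" if $order =~ /^(?:march|salute|train)$/;
    return $self->NEXT::ACTUAL::handle($order);
}

package main;

my $known   = Soldier->handle('eat');
my $unknown = eval { Soldier->handle('entrechat') };
my $error   = $@;

print "$known\n";
print "Exception: $error";
```

The eat order is passed along to Person and handled there. The entrechat order runs out of classes to try, and the final NEXT::ACTUAL call raises an exception where a plain NEXT call would have returned silently.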

Next time around

Paul also pointed out another situation in which NEXT's normal behaviour may not be desirable. By default, a series of NEXT invocations walks an object's entire inheritance hierarchy and calls every appropriately named method it encounters.

That can be a problem in a so-called "diamond inheritance" hierarchy, in which a derived class inherits a single ancestor class through two or more paths:

                 Person
                 ^^   ^
                /  \   \
           Worker   \  Thinker
              ^      \   ^
             /        \ /
         Soldier     Leader
             ^        ^
              \      /
              Commander

Under this kind of arrangement, a method defined in an "apex" class (like Person) can be called twice or more -- once each time NEXT's hierarchy traversal visits it.

Occasionally that might be appropriate, but often it's not. For example, we might be NEXTing through the destructors inherited by an object. Those methods will very probably be freeing up resources the object had acquired during its lifetime. If Person has such a destructor:

        sub Person::DESTROY {
                my ($self) = @_;
                seek $self->{changelog}, 0, 0;
                truncate $self->{changelog}, $self->{maxloglen};
                close $self->{changelog};
                $self->NEXT::DESTROY();
        }

then a Commander object would call that destructor three times (once through Worker, once through Leader, and once through Thinker). That would cause Very Bad Things to happen when the second call to Person::DESTROY attempted to manipulate the $self->{changelog} filehandle -- after the first call to Person::DESTROY had already closed it.

So NEXT provides yet another pseudo-class -- NEXT::UNSEEN -- which re-dispatches to the next appropriate method that hasn't already been re-dispatched to. So, if the Commander hierarchy used:

        $self->NEXT::UNSEEN::DESTROY();

throughout, then each inherited destructor would be called only once. Person::DESTROY would be called the first time (through Worker), but the second and third time it's encountered (through Leader and then Thinker) it will be skipped, since it has already been seen.
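The two behaviours can be compared side by side. This sketch rebuilds the Commander hierarchy with two hypothetical methods, cleanup_all (chained with plain NEXT) and cleanup_once (chained with NEXT::UNSEEN), and counts how often Person's version of each is reached. It assumes Leader lists Person before Thinker among its parents, which matches the order of visits described above:

```perl
use strict;
use warnings;
use NEXT;

my %visits = (all => 0, once => 0);    # Times Person's methods were reached

package Person;
sub cleanup_all  { $visits{all}++;  $_[0]->NEXT::cleanup_all() }
sub cleanup_once { $visits{once}++; $_[0]->NEXT::UNSEEN::cleanup_once() }

package Worker;
our @ISA = ('Person');
sub cleanup_all  { $_[0]->NEXT::cleanup_all() }
sub cleanup_once { $_[0]->NEXT::UNSEEN::cleanup_once() }

package Thinker;
our @ISA = ('Person');
sub cleanup_all  { $_[0]->NEXT::cleanup_all() }
sub cleanup_once { $_[0]->NEXT::UNSEEN::cleanup_once() }

package Soldier;
our @ISA = ('Worker');
sub cleanup_all  { $_[0]->NEXT::cleanup_all() }
sub cleanup_once { $_[0]->NEXT::UNSEEN::cleanup_once() }

package Leader;
our @ISA = ('Person', 'Thinker');    # Assumed parent order
sub cleanup_all  { $_[0]->NEXT::cleanup_all() }
sub cleanup_once { $_[0]->NEXT::UNSEEN::cleanup_once() }

package Commander;
our @ISA = ('Soldier', 'Leader');
sub cleanup_all  { $_[0]->NEXT::cleanup_all() }
sub cleanup_once { $_[0]->NEXT::UNSEEN::cleanup_once() }

package main;

Commander->cleanup_all();
Commander->cleanup_once();
print "Plain NEXT reached Person $visits{all} time(s)\n";
print "NEXT::UNSEEN reached Person $visits{once} time(s)\n";
```

With plain NEXT the Person method runs once per inheritance path; with NEXT::UNSEEN it runs exactly once.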

Oh, and yes, we can have both the "one ping only" behaviour of NEXT::UNSEEN and the "exception-on-failure" of NEXT::ACTUAL at the same time. Just use:

        $self->NEXT::UNSEEN::ACTUAL::method_name();

or:

        $self->NEXT::ACTUAL::UNSEEN::method_name();

What's next?

The NEXT module is now very stable and quite usable in production code. It will ship as part of the core distribution of perl 5.8.

Only two further enhancements are currently planned. The first is to provide a mechanism to allow an object's inheritance hierarchy to be traversed breadth-first (and maybe in other sequences too), rather than depth-first.

Breadth-first traversal is especially important for re-dispatching destructors, as it ensures that base-class destructors are not invoked until all the derived bits of an object have been properly cleaned up. That's vital because the derived bits may well be relying on the state of the base bits in some way.

Working out the correct breadth-first sequence is non-trivial in the general case. For example, consider the subtleties of determining the correct "breadth-first" order of the following (pathological, but legal) inheritance hierarchy:

                 Person
                 ^  ^ ^
                /   |  \
            Worker--|->Thinker
              ^  \  |    ^
             /    \ |   /
            /      v|  /
        Soldier-->Leader
             ^       ^
              \     /
              Commander
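Even the simpler diamond hierarchy shown earlier trips up an obvious approach. The following sketch is purely illustrative and not part of NEXT: a hypothetical naive_breadth_first function walks the @ISA arrays level by level, skipping classes it has already visited. It assumes Leader lists Person before Thinker among its parents:

```perl
use strict;
use warnings;

# The earlier diamond hierarchy (not the pathological one above)
package Person;
package Worker;    our @ISA = ('Person');
package Thinker;   our @ISA = ('Person');
package Soldier;   our @ISA = ('Worker');
package Leader;    our @ISA = ('Person', 'Thinker');    # Assumed parent order
package Commander; our @ISA = ('Soldier', 'Leader');

package main;

# Hypothetical helper: a plain queue-based breadth-first walk of @ISA
sub naive_breadth_first {
    my ($class) = @_;
    my @queue = ($class);
    my (%seen, @order);
    while (defined(my $next = shift @queue)) {
        next if $seen{$next}++;             # Visit each class only once
        push @order, $next;
        no strict 'refs';
        push @queue, @{"${next}::ISA"};     # Parents wait behind the current level
    }
    return @order;
}

my @order = naive_breadth_first('Commander');
print "@order\n";
```

This yields Commander, Soldier, Leader, Worker, Person, Thinker. Person lands ahead of Thinker, even though Thinker is derived from Person, so destructors called in this order would clean up a base class before one of its descendants. Getting the sequence right takes more than a simple queue.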

The second enhancement will be to integrate the NEXT module with the Class::Multimethods module and its forthcoming successor, Attributes::Multimethod. This will make it possible to re-dispatch multiply dispatched methods as well.

With those two additions Perl will then have one of the most flexible and powerful dispatch mechanisms of any programming language.

Just as it should.