Refactoring with Moose Roles

Ovid on 2009-03-16T09:31:48

I recently wrote about alternatives to inheritance. One strategy we're employing from that post is to convert our multiple inheritance to use Moose roles. Consider the following convoluted class hierarchy. Numbers designate "abstract" classes and letters designate "concrete" classes (those class to be used directly).

             +---------+
             |    0    |
             +---------+
            /           \
           /             \
      +---------+   +---------+
      |    1    |   |    2    |
      +---------+   +---------+
      /         \   /         \
     /           \ /           \
+---------+  +---------+  +---------+
|    A    |  |    3    |  |    B    |
+---------+  +---------+  +---------+
            /     |     \
           /      |      \
+---------+  +---------+  +---------+
|    C    |  |    D    |  |    E    |
+---------+  +---------+  +---------+

So we have five concrete classes and three "abstract" classes. Now that's a mess, but from what I've seen, it's not terribly unusual when you get larger systems. Deadlines, unclear design goals or lack of knowledge about multiple inheritance issues can lead to such design choices.

For example, what happens if abstract classes 1 and 2 both provide an update method? Which do you call? Can both of them safely alter the state without interfering with the other? Should you only call one? Should you call both? If you must call both, is the order relevant?

Diamond inheritance is also "fun". What happens if 2 implements save, but 0 also implements it? With Perl's standard "leftmost/depth-first" method resolution, the save method in 0 will be called, not 2. C3 method resolution order can help, but not everyone uses that.

The cognitive overload is probably the worst thing about this. When you call D->save, you know you've inherited from 3, but you also need to know that 3 has inherited from 1 and 2 and each of those has, in turn inherited from 0. To be really safe, you have to root around in all of those classes, understanding the code flow and how everything interacts. What's worse, particularly with Perl, is that generally the public methods are documented, so if you don't actually read the code of the class you're inheriting from, you might silently override a private method. Oops.

That last issue is symptomatic of the real killer. Multiple inheritance often leads to silent bugs. You don't see them until you're trying to figure out why a dependent system is not finding the data it needs.

So let's see what a "refactored" version of this might look like:

                          +---------+
          +---------------|    0    |---------------+
         /                +---------+                \
        /                   /  |  \                   \
       /            +------+   |   +------+            \
      /            /           |           \            \
     /            /            |            \            \
+---------+  +---------+  +---------+  +---------+  +---------+
|  A (1)  |  |  B (2)  |  |C (1,2,3)|  |D (1,2,3)|  |E (1,2,3)|
+---------+  +---------+  +---------+  +---------+  +---------+

In the code above, 1, 2 and 3 have factored out into roles and we have single inheritance with only a single level in the inheritance hierarchy. So you edit "D" and you see this:

package D;

use Moose;
extends 'Base0';
with qw< Role1 Role2 Role3 >;

Right off the bat, you gain a ton of information which multiple inheritance silently hides from you. You see most of the other packages which directly impacts your code. Further, as you might recall, I suggested that 1 and 2 might both implement an update method. With the above code, that will be a compile time failure since Moose can't determine which update method you want. This leads to an interesting "gotcha" with Moose roles:

package D;

use Moose;
extends 'Base0';
with 'Role1';
with 'Role2';
with 'Role3';

You should think long and hard about having separate with statements. If you have one with statement and list all of the roles, method composition detects conflicts. With separate with statements, subsequent role application will not check for these conflicts, eliminating a strong compile-time benefit. However, it does allow you to use roles in a similar fashion to Ruby's "mixins", so this might make your socks roll up and down.

There's also another interesting issue:

package D;

use Moose;
extends 'Base0';
with qw< Role1 Role2 Role3 >;

sub update { ... }

If you already have a method defined which the roles provide, Moose assumes that you know what you're doing and will not add the Role1::update method to your class. This is consistent with Perl 6's behavior. Unfortunately, I think this is a misfeature and can lead to silent failures in your code. However, this is still better than the multiple inheritance situation because you can at least see more easily in your code all of the other classes (or roles) which directly impact what you're doing.

In short, using roles to eliminate multiple inheritance simplifies class hierarchies and makes explicit which code alters the behavior of the class you're writing. By providing all of this information up front rather than burying it several levels deep in parent classes, you reduce the amount of work a developer must do to understand class behavior.

Just because I don't know...

jk2addict on 2009-03-16T14:23:25

We're also assuming that the user of Role1 and Role2 has control over those roles.

What if I want to use Role1 from AuthorA and Role2 from AuthorB, and both roles have an update() method?

How do I resolve that? Even still, at least the problem will be up front.

Re:Just because I don't know...

Ovid on 2009-03-16T14:30:03

If AuthorA and AuthorB are separate classes, just use the correct roles and exclude the methods you don't want. If they're separate instances, use runtime role application:
Role1->meta->apply($author_a) unless $author_a->meta->does_role('Role1'); Role2->meta->apply($author_b) unless $author_b->meta->does_role('Role2');

That's a little ugly, but it's safer than not checking for the use of the role. If you know that the role(s) will only be applied at runtime once and only once, you can skip the does_role check.