MMD and Allomorphism

Ovid on 2006-12-10T19:34:48

When I first read the docs for Class::Multimethods, I was confused.

    # IMPORT THE multimethod DECLARATION SUB...

    use Class::Multimethods;

    # DECLARE VARIOUS MULTIMETHODS CALLED find...

    # 1. DO THIS IF find IS CALLED WITH A Container REF AND A Query REF...

    multimethod find => (Container, Query)
                     => sub { $_[0]->findquery($_[1]) };

    # 2. DO THIS IF find IS CALLED WITH A Container REF AND A Sample REF...

    multimethod find => (Container, Sample)
                     => sub { $_[0]->findlike($_[1]) };

However, after a bit of time, I realized I was being silly and the syntax is actually not too bad. With how Perl works, that's one of the cleaner ways of describing things. I probably would have found it easier if I had seen it written like this:

multimethod find => (Container, Query) => sub {
    my ( $container, $query ) = @_;
    $container->findquery($query)
};

multimethod find => (Container, Sample) => sub {
    my ( $container, $sample ) = @_;
    $container->findlike($sample)
};

Ah, that's much clearer to me.

Now a more conventional syntax would be cleaner still:

multimethod find(Container $container, Query $query) {
    $container->findquery($query);
}

multimethod find(Container $container, Sample $sample) {
    $container->findlike($sample);
}

As it turns out, if you have multiple "multi" subs, you can wind up with complicated dispatch rules. These may be fast in C but can slow things down in pure Perl. The following algorithm was proposed for Perl 6 and appears to differ from how Damian's Class::Multimethods works:

1. Gather all visible variants with a compatible number of parameters (taking into account the requirements of any "where" constraints).
2. If there are no such variants, throw a "no such multi" exception.
3. Work out the Manhattan distance from the argument list to each variant's parameter list.
4. If there is a unique minimum, call that variant.
5. Otherwise, discard every variant whose Manhattan distance isn't minimal.
6. Work out the degree of specialization of each remaining argument list (i.e. the total number of "where" specializations on the variant's complete set of parameters).
7. If there is a unique maximum, call that variant.
8. Otherwise, if there is a compatible variant with an "is default" trait, call that variant.
9. Otherwise, throw an "ambiguous call" exception.

Got that? Damian Conway summarizes this as "Unique least-inherited most-specialized match, or default." That's actually not too bad, but I still get a bit twitchy reading it. However, you can even get decent performance out of it if you make heavy use of caching.
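To make the distance step concrete, here is a minimal Perl 5 sketch of the first half of that algorithm (steps 1 through 5) with a simple cache. It is only an illustration, not how Class::Multimethods or Perl 6 actually does it. The "where" and "is default" steps are left out, and the names pick_variant, manhattan_distance, inheritance_distance and the FullTextQuery subclass are all made up for the example. Distance is taken as the position of the parameter class in the argument's linearized @ISA, which equals the number of inheritance steps only under single inheritance.

```perl
use strict;
use warnings;
use mro;

# Hypothetical classes; FullTextQuery is an invented subclass of Query.
package Container     { sub new { bless {}, shift } }
package Query         { sub new { bless {}, shift } }
package FullTextQuery { our @ISA = ('Query') }

# Position of $target in the linearized @ISA of $class, or undef if unrelated.
sub inheritance_distance {
    my ( $class, $target ) = @_;
    my @linear = @{ mro::get_linear_isa($class) };
    for my $i ( 0 .. $#linear ) {
        return $i if $linear[$i] eq $target;
    }
    return;
}

# Sum of the per-argument distances; undef if any argument is incompatible.
sub manhattan_distance {
    my ( $arg_classes, $param_classes ) = @_;
    return unless @$arg_classes == @$param_classes;
    my $total = 0;
    for my $i ( 0 .. $#$arg_classes ) {
        my $d = inheritance_distance( $arg_classes->[$i], $param_classes->[$i] );
        return unless defined $d;
        $total += $d;
    }
    return $total;
}

my %cache;    # "Class1,Class2" => chosen variant, so repeat calls skip the search

sub pick_variant {
    my ( $variants, @args ) = @_;
    my @classes = map { ref $_ } @args;
    my $key     = join ',', @classes;
    return $cache{$key} if exists $cache{$key};

    my ( $best, @winners );
    for my $variant (@$variants) {
        my $d = manhattan_distance( \@classes, $variant->{params} );
        next unless defined $d;
        if ( !defined $best || $d < $best ) {
            ( $best, @winners ) = ( $d, $variant );
        }
        elsif ( $d == $best ) {
            push @winners, $variant;
        }
    }
    die "no such multi\n"  unless @winners;
    die "ambiguous call\n" if @winners > 1;
    return $cache{$key} = $winners[0];
}

my @find = (
    { name => 'find(Container, Query)',         params => [ 'Container', 'Query' ] },
    { name => 'find(Container, FullTextQuery)', params => [ 'Container', 'FullTextQuery' ] },
);
my $chosen = pick_variant( \@find, Container->new, FullTextQuery->new );
print "$chosen->{name}\n";    # find(Container, FullTextQuery): distance 0 beats 1
```

The cache here is keyed only on the argument classes, so it assumes a single multi. A real implementation would also have to key on the multi's name and invalidate entries when variants or @ISA change.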

Regardless of whether you use Manhattan distance or some other technique to determine the correct dispatch of the multimethod, there are two problems with this approach: one of syntax and one of implementation.

The syntactic problem is this:

multimethod find(Container $container, Query $query) {
    $container->findquery($query);
}

multimethod find(Container $container, Sample $sample) {
    $container->findlike($sample);
}

See the problem? No? OK, let's try again:

multimethod find(Container $container, Query $query) {
    $container->findquery($query);
}

# 300 lines of code later

multimethod find(Container $container, Sample $sample) {
    $container->findlike($sample);
}

Ah ha! Now you can see the problem. The syntax makes it very easy for the programmer to accidentally split up related sections of code and that makes maintenance harder. What I would like to see is something like this:

multimethod find -> 
  (Container $container, Query $query) {
    $container->findquery($query);
  },
  (Container $container, Sample $sample) {
    $container->findlike($sample);
  };

I can't say that this is the exact syntax I would want, but it has the advantage that the programmer is forced to group the overloaded functions/methods together.

The other problem is the lack of support for allomorphism (classes unrelated by inheritance which still have semantically equivalent sets or subsets of methods). For example, consider this pseudo-code for Perl 5:

multi method get_customer (CGI $query) {
    my $customer = $query->param('customer') or croak $some_message;
    return Customer->new($customer);
}

This assumes that the argument is an instance of CGI.pm. But what if you like CGI::Simple? It has the same interface (without the HTML stuff), passes all of the CGI.pm tests, but is lighter and faster. It should work for the above, but it fails because the type is hardcoded when it's the behavior we're really interested in. Further, because of Perl's poor introspection, it's not possible to know whether two methods are semantically equivalent (in other words, do they have the same signatures and return types?).
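In plain Perl 5, the usual workaround is to check for the behavior rather than the class. The sketch below assumes two made-up classes, FakeCGI and FakeCGISimple, standing in for CGI.pm and CGI::Simple, plus a toy Customer class; none of these are real modules. Note that can() only confirms that a method named param exists, which is exactly the introspection limit described above: it says nothing about whether the two methods mean the same thing.

```perl
use strict;
use warnings;
use Carp qw(croak);
use Scalar::Util qw(blessed);

# Hypothetical stand-ins for CGI.pm and CGI::Simple: two classes unrelated
# by inheritance that both happen to provide param().
package FakeCGI {
    sub new   { my ( $class, %params ) = @_; return bless { %params }, $class }
    sub param { my ( $self, $name ) = @_; return $self->{$name} }
}

package FakeCGISimple {
    sub new   { my ( $class, %params ) = @_; return bless { %params }, $class }
    sub param { my ( $self, $name ) = @_; return $self->{$name} }
}

# Toy Customer class, also hypothetical.
package Customer {
    sub new { my ( $class, $name ) = @_; return bless { name => $name }, $class }
}

# Accept any object that can param(), whatever its class.
sub get_customer {
    my ($query) = @_;
    croak 'argument must provide a param() method'
        unless blessed($query) && $query->can('param');
    my $name = $query->param('customer') or croak 'no customer supplied';
    return Customer->new($name);
}

my $from_cgi    = get_customer( FakeCGI->new( customer => 'alice' ) );
my $from_simple = get_customer( FakeCGISimple->new( customer => 'bob' ) );
print "$from_cgi->{name} $from_simple->{name}\n";    # alice bob
```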

This raises another interesting problem, though. There's a well-known ambiguous dispatch problem with MMD. Let's say that Cat and Dog inherit from a Mammal class.

multi sub sausage (Cat $cat, Mammal $mammal) {
    # do something
}

multi sub sausage (Mammal $mammal, Dog $dog) {
    # do something
}

Which gets called with sausage($cat, $dog);? It's ambiguous and can't be resolved. However, these ambiguities tend to arise only when you have more than one argument to the function or method (excluding the invocant, if any). But what about allomorphism? Imagine a system that lets you supply signatures to fall back on if you don't have the exact class you want:
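The tie in the Cat and Dog example is easy to see if you add up the inheritance steps by hand. This small Perl 5 sketch does that, assuming empty Mammal, Cat and Dog packages and a made-up helper called steps_up; it only computes the distances and does not dispatch anything.

```perl
use strict;
use warnings;
use mro;

# Empty illustrative classes: Cat and Dog both inherit from Mammal.
package Mammal { }
package Cat    { our @ISA = ('Mammal') }
package Dog    { our @ISA = ('Mammal') }

# Steps from $class up to $target in the linearized @ISA (undef if unrelated).
sub steps_up {
    my ( $class, $target ) = @_;
    my $isa = mro::get_linear_isa($class);
    my ($steps) = grep { $isa->[$_] eq $target } 0 .. $#$isa;
    return $steps;
}

my @args     = ( 'Cat', 'Dog' );    # the classes passed in sausage($cat, $dog)
my %variants = (
    'sausage(Cat, Mammal)' => [ 'Cat',    'Mammal' ],
    'sausage(Mammal, Dog)' => [ 'Mammal', 'Dog' ],
);

my %distance;
for my $name ( sort keys %variants ) {
    my $params = $variants{$name};
    $distance{$name} =
        steps_up( $args[0], $params->[0] ) + steps_up( $args[1], $params->[1] );
    print "$name: $distance{$name}\n";
}
# sausage(Cat, Mammal): 1
# sausage(Mammal, Dog): 1
```

Both variants sit at distance 1 from the call, so there is no unique minimum to pick.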

multi sausage (Mammal $mammal) {
    make_sausage($mammal);
}

# "{ void cry_for_help(void) }" is a signature which means
# we can dispatch to any instance which provides this
multi sausage (AnyThing $thing { void cry_for_help(void) } ) {
    $thing->cry_for_help;   # before being ground
    make_sausage($thing);
}

Now assume that, by default, the Mammal class does not have a cry_for_help method. What if we apply a Beg trait to a Mammal subclass for a mammal which ordinarily cannot make noise, and the Beg trait supplies the cry_for_help method? Which of the above multis should be called? It might be the first, since the argument is a Mammal, but it might be the second, since that one lets it cry for help. Depending upon what each function does, we could argue for either. Should we favor behavior over classes in this case?
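To show why the answer depends on policy rather than on the object, here is a small Perl 5 sketch. It assumes a made-up BeggingPig subclass that defines cry_for_help directly (standing in for a Beg trait being applied) and a made-up first_match helper that tries rules in a given priority order. The same object satisfies both rules, so whichever rule is checked first wins.

```perl
use strict;
use warnings;

# Hypothetical classes: Mammal has no cry_for_help; BeggingPig gains one,
# standing in for a Mammal subclass with a Beg trait applied.
package Mammal     { sub new { bless {}, shift } }
package BeggingPig { our @ISA = ('Mammal'); sub cry_for_help { return 'Help!' } }

# Two ways a variant could accept an argument: by class or by behavior.
my %rule = (
    by_class    => sub { $_[0]->isa('Mammal') },
    by_behavior => sub { $_[0]->can('cry_for_help') },
);

# Return the name of the first rule, in the given priority order, that accepts $obj.
sub first_match {
    my ( $obj, @order ) = @_;
    for my $name (@order) {
        return $name if $rule{$name}->($obj);
    }
    return;
}

my $pig            = BeggingPig->new;
my $class_first    = first_match( $pig, qw(by_class by_behavior) );
my $behavior_first = first_match( $pig, qw(by_behavior by_class) );
print "$class_first\n";       # by_class
print "$behavior_first\n";    # by_behavior
```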

Are there simple answers to these problems? I don't know, but I suspect there aren't. Creating an allomorphic MMD system is likely to introduce plenty of bugs in programmers' code, and while allomorphism solves some complexity problems, it seems to introduce others.

RE: MMD and Allomorphism

Stevan on 2006-12-11T05:01:31

Ah ha! Now you can see the problem. The syntax makes it very easy for the programmer to accidentally split up related sections of code and that makes maintenance harder.

Well, this makes the assumption that the two methods are related in some way other than having the same name. When you think of classes being a collection of methods and attributes (or fields or instance variables or whatever you wanna call them), then the idea of splitting up this multi-method across 300 lines of code is horrid. But if you flip that assumption over (because it is only an assumption, and not a hard and fast rule of all OO systems) and look at methods as being first class things on par with classes, it does not seem to be such a problem.

I think the difference is best illustrated in how a method is called. Do you call the method like this:

$obj->foo(); # send the message foo to the instance $obj

or do you call the method like this:

foo($obj); # dispatch a variant of foo on the instance $obj

They can produce the same results, but the approach is completely different.

Imagine a system that lets you supply signatures to fall back on if you don't have the exact class you want:

multi sausage (AnyThing $thing { void cry_for_help(void) } ) {
    $thing->cry_for_help;   # before being ground
    make_sausage($thing);
}

There is no need to imagine it, you can have it today! Ladies and Gentlemen, I give you ... OCaml

class anything = object (self)
  method cry_for_help = print_string ("Help anything\n")
  method foo = 1
end;;

class anywho = object (self)
  method cry_for_help = print_string ("Help anywho\n")
  method bar = 1
end;;

let sausage x = x#cry_for_help;;

sausage (new anything);
sausage (new anywho);;

Here I have defined two classes which have the following "class signatures" (this is what the OCaml REPL will print out for you):

class anything : object method cry_for_help : unit method foo : int end
class anywho : object method bar : int method cry_for_help : unit end

And I have a function (sausage) which, through the magic of type inference, has been determined to have the following signature:

val sausage : < cry_for_help : 'a; .. > -> 'a = <fun>

What this is saying is that sausage must be passed an object (not a class, only an instance) which has a method cry_for_help. When I run this in the OCaml REPL I get this:

Help anything
Help anywho

What is basically going on here (as best as I understand it) is called "structural typing": instead of just comparing the types by name (Foo->isa(Foo)), it compares them structurally.

And of course my favorite part of OCaml is that I didn't have to actually write down any of this type information at all. The type inferencer just figured it all out for me. It determined that I would be calling the cry_for_help method on the object passed into sausage and built all the type signatures I needed. It then determined that the instance of anything and anywho properly fit that signature, and so it let me compile the code.

Of course, this all comes at a price. OCaml is statically typed, so things like runtime code generation, runtime class creation or even auto-converting a string to an int are pretty much impossible to do. However, you do get blazing fast code (comparable to C/C++ in all the shootouts) and the warm fuzzy feeling that only static typing can bring you.

- Stevan