Perl 6, making life easier for tool writers...

pdcawley on 2002-04-04T11:01:28

So, I was thinking about Smalltalk, and Perl 6, and another idea popped up.

Smalltalk has the concept of method categories, which are used to organize methods within the system browser. This means you can easily look at all the 'factory methods' implemented by a class, or all the accessor methods, or whatever. With large classes this can make a big difference to the class's understandability when you look at it in the system browser.

So, consider this chunk of Perl6 code.



module Categories is Mixin {
  sub add_categories(@categories) {...}

  # Big, big guess/handwave, but it's not 
  # essential to have. Just nice.
  sub category($name, &block) {
    &block.optree.walk_with -> $op {
      when OP::Sub {
        .attrib(category) //= $name;
      }
      return $op;
    }
  }
}

class SomeClass {
  use Categories;

  add_categories
    'factoryMethod', 'accessor', 'setter',
     strangeCategory => "Category description";

  method new is category(factoryMethod) {...}
  
  method attrib is category(accessor) {...}
  method set_attrib is category(setter) {...}

  category strangeCategory {
    method foo {...}
    method bar {...}
  }
}

So, what's the benefit, I hear you ask? Well, it moves documentation into the optree. I've realised from the work I've been doing on a refactoring browser that the more documentation you can get at 'programmatically', the easier it is to code useful tools.

For instance, given the category syntax above, it would become possible for a putative class browser to query a class for its methods and categories, and it could then query those methods to find out which category to display each method in.
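
A rough sketch of that query, in Python rather than Perl 6, since the syntax above is speculative. The `category` decorator, the `methods_by_category` helper and the 'uncategorized' bucket are all made up for illustration; the only point is that the category lives on the code object itself, so a browser never has to read the source to group the methods.

```python
from collections import defaultdict


def category(name):
    """Hypothetical decorator: tag a function with a browser category."""
    def tag(func):
        func.category = name  # the metadata rides along on the function object
        return func
    return tag


class SomeClass:
    @category("accessor")
    def attrib(self):
        return self._attrib

    @category("setter")
    def set_attrib(self, value):
        self._attrib = value

    @category("strangeCategory")
    def foo(self): ...

    @category("strangeCategory")
    def bar(self): ...

    def untagged(self): ...


def methods_by_category(cls):
    """Group the functions defined directly on cls by their category tag."""
    groups = defaultdict(list)
    for name, member in vars(cls).items():
        # Skip dunder entries such as __module__ and __doc__.
        if callable(member) and not name.startswith("__"):
            groups[getattr(member, "category", "uncategorized")].append(name)
    return {cat: sorted(names) for cat, names in groups.items()}
```

Calling `methods_by_category(SomeClass)` gives back a dict keyed by category name, which is about all a browser needs to draw its method panes.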

And, to my eyes at least, the source code reveals its intent more clearly.

Reinventing Smalltalk

ziggy on 2002-04-04T14:16:50

This is certainly an idea from Smalltalk that has been lost for the ages.

As originally designed in Smalltalk, I think that the category descriptions for methods were done using metadata (comments):

"methods for parsing" .... "methods for error handling" ....

The code browser would respect these comments scattered about the code and use that information to simplify its display.

Putting this information into the op tree sorta sounds good, but modifying the syntax of the language (through a user-defined category block) feels like it's taking three steps towards Java just to get one step closer to Smalltalk.

How does this look:

use Categories;  ## load behaviors to respect '=category' comments

class SomeClass {
  =category factoryMethod
  method new {...}

  =category accessor
  method attrib {...}

  =category setter
  method set_attrib {...}

  =category strangeCategory
  method foo {...}
  method bar {...}
}

(To be fair, that code is much smaller because the Categories module isn't stubbed in.)

Personally, I like this better because the metadata for this module is metadata. Presumably the code browser is going to have access to raw source code, and can annotate its view of the world accordingly. That also means that the optree doesn't need to be decorated during execution, when this information is likely to be meaningless.

Depending on how the parser finally winds up, the Categories module could be quite small -- equating the =category tag to a self-contained =for statement that doesn't need an empty paragraph to be terminated. The code browser could switch in a different Categories module that actually processes the metadata and annotates the sub definitions. Or perhaps that trivial amount of code is always switched on, but never used outside the code browser...
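
For comparison, here is roughly what a browser would have to do if the categories stayed in the source as `=category` lines. It's Python, not Perl 6, and the scanning rule (a `=category` line applies to every following `method` line until the next `=category`) is an assumed reading of the example above, not anything specified. `scan_categories` and the two regexes are hypothetical names.

```python
import re

# Assumed line shapes: "=category <name>" and "method <name> ...".
CATEGORY_LINE = re.compile(r"^\s*=category\s+(\w+)\s*$")
METHOD_LINE = re.compile(r"^\s*method\s+(\w+)")


def scan_categories(source):
    """Map each method name to the most recent =category tag above it."""
    current = None
    found = {}
    for line in source.splitlines():
        tag = CATEGORY_LINE.match(line)
        if tag:
            current = tag.group(1)
            continue
        method = METHOD_LINE.match(line)
        if method:
            found[method.group(1)] = current
    return found


SOURCE = """\
class SomeClass {
  =category factoryMethod
  method new {...}

  =category strangeCategory
  method foo {...}
  method bar {...}
}
"""
```

`scan_categories(SOURCE)` maps `new` to `factoryMethod` and both `foo` and `bar` to `strangeCategory`. A real tool would need a proper parser rather than two regexes.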

Re:Reinventing Smalltalk

pdcawley on 2002-04-04T14:54:00
The catch with that approach is that the putative browser has to start parsing the source file to work out what's going on.

If category information is just another attribute on the coderef, then the browser doesn't have to start looking in the source to build its browsable structure; it only has to worry about the source when you want the method listing.

Of course, if B::Deparse's Perl6 equivalent can be relied on to work in all cases, you don't even need to see the source; it can be regenerated from what's in memory.

I'm wondering though if we're going to need to distinguish between the executable optree and some higher level abstract syntax tree. The AST would be far easier for tools/macros/whatever to monkey with but would (usually) get thrown away once you reach runtime...

Re:Reinventing Smalltalk

ziggy on 2002-04-04T15:23:34

The catch with that approach is that the putative browser has to start parsing the source file to work out what's going on.
...then that's a good argument to adopt macros or otherwise tweak the parser to take a =category comment as a way of specifying a category attribute on a sub.
The category {...} syntax you propose looks like it's adding another level of lexical scope, when all it's doing is adding metadata. That sounds like an impedance mismatch to me.
I'm less concerned about how the internals work here than I am about how the idiom looks and feels. Keeping this metadata in a comment -- like Smalltalk did -- feels like a better solution.

Re:Reinventing Smalltalk

pdcawley on 2002-04-04T16:24:21
The 'new scope thing' is a very good point; most of the time you really don't want to do that. And I don't want to have to think about
class foo {
  ...
  my category private is private {
    sub foo {...}
    sub bar {...}
  }
}
I am convinced. I'm not sure I like your proposed syntax though.

However, quibbling over syntax is beside the point. The point is that Perl6 is going to make it possible for us to do this sort of categorization.

Re:Reinventing Smalltalk

ziggy on 2002-04-04T16:32:02

my category private is private {...}
Are you sure about that? Sounds Clintonian (depends on what the meaning of "private" is...)
:-)

Re:Reinventing Smalltalk

Elian on 2002-04-11T19:16:23
Few things Parrot's doing structurally that may help:

Docstrings as a guaranteed property on everything. (Potentially empty, mind, but guaranteed.)

Ties between the bytecode stream and the AST

Ties between the bytecode stream/AST and the original source

So while there's no guarantee that everything'll be there (someone might strip the bytecode) you should be able to look at a sub ref, grab a pointer to its AST, grab a pointer to the beginning of its source, and potentially see any documentation that's been attached to it.
How this stuff is provided for, or represented, at the perl level is left as an exercise for Larry. :)
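
Python function objects make a handy analogy for this, though it is only an analogy and says nothing about Parrot's actual interfaces. A function reference carries its docstring and a pointer back to the file and line where it was defined, and the docstring can go missing if the bytecode was built with docstrings stripped (`python -OO`). The `describe` helper and the `parse_header` example function are made up for illustration.

```python
def describe(func):
    """Collect what can be learned from a function reference alone."""
    code = func.__code__
    return {
        "name": func.__name__,
        "doc": func.__doc__,  # None if stripped or never written
        "source_file": code.co_filename,
        "first_line": code.co_firstlineno,
    }


def parse_header(text):
    """Split a 'Key: value' header line into a pair."""
    key, _, value = text.partition(":")
    return key.strip(), value.strip()
```

`describe(parse_header)` reports the name, the docstring (if it survived), and where the definition starts, all without opening the source file.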