Three things in Perl 6 that aren't so great

masak on 2009-07-22T21:22:15

I keep repeating how great Perl 6 is, because it is. But there are things I wish were different, and that I feel are accidents of history. Maybe they will change before Christmas rolls around, maybe not. But at least to my mind, they stand out as emergent mistakes, good features that combine to make something pretty bad.

Here they are:

  • Methods and Pod
  • Form syntax and string interpolation
  • Comments in the beginning of lines

Below, I'll go through each problem in detail.

Methods and Pod

Perl 6 has two ways of declaring package-like entities such as classes, roles and modules. One way is familiar from Perl 5, and looks like a single statement:

class Austria;

# code here

The other is familiar from some other languages, and looks like a block:

class Austria {
   # code here
}

For the purposes of this blog post, let's call these the statement form and block form, respectively.

The block form is supposed to be the general one, and you can tell from the spec because that's the one that doesn't have any restrictions to it. The statement form is restricted so that it may only occur once in a file, preferably (one might assume) somewhere at the top. In practice, this means that if you have one file with several classes in it, you'll have to use the block form. That's OK, because people like the block form anyway; that's the one they tend to use when no-one tells them not to.

Now, enter Pod. Just like in Perl 5, the Pod directives are written on the leftmost column of the file, no exceptions. This has to do with parsing and stuff; it should be really easy to tell what's a Pod comment and what isn't. But look what happens: people will tend to use the block form for their classes, they will want to document them with Pod, and the result — indented methods and non-indented Pod — will be too hideously ugly for the poor Perl 6 programmer to bear. So they will have to use the more restricted statement form. They will cry a little, because they like the block form. But in the end, they will switch back to the statement form, because the alternative will be too ugly to think about.

There hasn't been much of an uproar about this, because people haven't started Pod-documenting their methods in earnest yet. I did it with the Druid classes, and went through all the above stages: shocked realization, sadness, and then switching to the statement form. I expect others will do the same on a massive scale, at least if they decide to keep their method Pod.

We never had this problem in Perl 5, because Perl 5 doesn't have the block form. We haven't started having the problem in Perl 6, because Pod is in a state of limbo, and people don't really know how to use it to document their methods anyway. But Pod and the block form for declaring classes (and stuff) don't mix well.

Form syntax and string interpolation

E07 was written long ago, in Internet years. Some time after that, the string interpolation changed, and {...} was appropriated to mean "this part of the string isn't part of the string".

So people will be bitten every time they use an interpolating string with the form function. This error will be caught as a syntax error at compile time, so it's not a critical flaw, just bloody annoying.

Matt-W++ is still building Form.pm for Perl 6, so this also hasn't become a real annoyance for people yet. But there's a corresponding thing with eval and interpolating strings that bites people all the time.

It's kinda double-edged: we all like {...} in interpolated strings, but it keeps coming back and biting us too, because we forget it's special syntax.

Comments in the beginning of lines

There was a long bikeshedding discussion about this some years ago. Perl 6 introduces embedded comments, comments which start with a # and then a bracketing character, with all the Unicode smorgasbord to choose from.

However, it was soon realized that people who commented things out by putting #s at the absolute beginning of lines might accidentally create embedded comments by placing their # next to a curly brace or a parenthesis or a bracket. So anything that looks like an embedded comment at the beginning of a line is now treated as a syntax error.

At this point I would like to add that this does not bother me very much. I have come to terms with this particular oddity of the language. But it took me a while, and I can still feel a lingering sense of dissatisfaction with Perl 6 causing me to (*gasp*) change a habit, and one I don't think is a particularly bad one to begin with. Perl is supposed to accommodate a large range of common programming styles, and quickly commenting out some lines by prefixing them with #s is pretty common.

I've now learned to put ## at the beginning of lines I want to comment out instead of just #. This takes care of any unintended embedded comments. So things are fine over here; I'm just worried that a lot of people will feel significant pain when they go through the same process of realization.

I'm not sure embedded comments are all that useful. I seem to find a use for them sometimes in one-liners, but very seldom in other circumstances. Are they really worth the special-casing of a common development technique?

In conclusion

Perl 6 (the spec) is lovely, except in spots. It also isn't finished yet. I like the fact that I can complain like this in a blog post, and smart people will pick my arguments apart, or mull over them and propose improvements for the synopses. All the above three misfeatures are hard to solve because they arise as consequences of features we want, and so fixing the emergent problems would mean going back and changing the features somehow. That's hard.

As usual, feel free to comment. I'm the only one I know who has been severely bitten by the first and the last things, but I'd love to hear if others have too, and how they felt about it.


Methods and Pod not actual problem

duncand on 2009-07-22T23:59:48

There isn't actually a problem in the interaction of methods and pod, when you consider that the "problem" is based on false assumptions.

One false assumption is that methods have to be indented when you have a block class in order to be pretty; I disagree, and methods can be flush left just like their pod, so all pretty so far.

Another false assumption is that you can't have block classes in Perl 5, and so what worked for Perl 5 can't work for Perl 6; in fact you *can* have block classes in Perl 5, and I have been doing so for years in my newer CPAN modules.

A while ago I figured out a style that works well for both Perl 5 and Perl 6, that uses both block classes and flush-left methods. See http://cpansearch.perl.org/src/DUNCAND/Muldis-Rosetta-0.13.3/lib/Muldis/Rosetta/Engine/Example.pm for an example.

You write a block class in Perl 5 like this:

{ package Austria; # class

#####

sub foo { ...
}

#####

} # class Austria

In Perl 6, it is exactly the same format:

class Austria {

#####

method foo (...) { ...
}

#####

} # class Austria

Like Perl 6, in Perl 5, surrounding the package declaration and following code with a brace pair makes it easy and elegant to have multiple package declarations in the same file, which I often exploit.
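
To make the two layouts under discussion concrete, here is a minimal Perl 5 sketch with two brace-wrapped packages in one file. The package names, methods and Pod text are hypothetical, chosen only for illustration. The first package indents its methods, so the Pod has to jump back to the leftmost column; the second keeps methods flush left, so code and Pod share a column.

```perl
use strict;
use warnings;

# Indented methods: the Pod directives still have to start in column 0.
{ package Austria;

    sub capital { return 'Vienna' }

=head2 anthem

Returns the title of the anthem.

=cut

    sub anthem { return 'Land der Berge, Land am Strome' }

} # package Austria

# Flush-left methods: code and Pod line up in the same column.
{ package Australia;

sub capital { return 'Canberra' }

=head2 anthem

Returns the title of the anthem.

=cut

sub anthem { return 'Advance Australia Fair' }

} # package Australia

print Austria->capital, ' / ', Australia->capital, "\n";
```

Both packages behave identically at run time; the difference is purely visual, which is what the disagreement in this thread is about.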

And if one follows a practice of putting visual lines between routine declarations to make them easier to read by delineating where routines begin and end, one can also put them before the first and after the last method. This break removes any unpleasantness of the methods not being indented relative to the class block; this also gives us another indent-level of horizontal space to use within the routine.

-- Darren Duncan

Re:Methods and Pod not actual problem

masak on 2009-07-23T09:35:41

There isn't actually a problem in the interaction of methods and pod, when you consider that the "problem" is based on false assumptions.

One false assumption is that methods have to be indented when you have a block class in order to be pretty; I disagree, and methods can be flush left just like their pod, so all pretty so far.

I had this discussion with Tene yesterday after writing the blog post. He said the same thing.

I think I started to value indenting everything many years ago, when I wrote my first medium-sized application (in BASIC) without indentation, and got into a situation when I had to find a missing 'END' somewhere. Everything I've been doing since then has reinforced the idea that consistently indenting things is a really, really good thing.

But you're right: making an exception in this case would solve the whole thing. I guess I'll have to try it out a few times and see if I can make an exception to my hard-line indentation policy.

Another false assumption is that you can't have block classes in Perl 5, and so what worked for Perl 5 can't work for Perl 6; in fact you *can* have block classes in Perl 5, and I have been doing so for years in my newer CPAN modules.

That's good news. I've actually seen this habit in people's Perl 5 code, but haven't thought of it as the block form, even though it effectively is.

Note that, due to the weird rules surrounding the statement form, the same syntax wouldn't work in Perl 6 for more than one class per file. But then again, the block form fills the same purpose, so there's no need to use the statement form for this.

And if one follows a practice of putting visual lines between routine declarations to make them easier to read by delineating where routines begin and end, one can also put them before the first and after the last method.

De gustibus non est disputandum, but those visual lines look very intrusive to me. I tend to prefer to use comments to introduce explanations into my code, not visual lines. I see their use for tricking the brain into not being bothered by the lack of indentation, though. In my own code, I think I'll make do without them.

Re:Methods and Pod not actual problem

Ovid on 2009-07-24T07:03:13

One false assumption is that methods have to be indented when you have a block class in order to be pretty; I disagree, and methods can be flush left just like their pod, so all pretty so far.

This is an interesting issue in Perl because now TIMTOWTDI makes a real mess of things. Frankly, I don't want my methods to be flush left because if I scan down the code, I like my indentation to instantly give me a hint of scope. If the method is indented, that gives me information that left justification won't. So let's say I have two classes in one package. The structure can look like this:

class
    method
    method
    method
end
class
    method
    method
    method
end

I can see at a glance we have a second class there. That's very important, but easy to miss with this (particularly when you consider things won't be laid out this cleanly):

class
method
method
method
end
class
method
method
method
end

Re:Methods and Pod not actual problem

masak on 2009-07-24T07:13:19

Thank you, Ovid. You put words to what was only an indistinct feeling for me. I agree fully; I also don't want to sacrifice indentation within classes.

But I also want my Pod. Maybe I'll end up putting the Pod at the end of each file, and then writing some tests to make sure the Pod doesn't drift away from the methods themselves.
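
A rough idea of what such a drift test could look like, sketched in Perl 5 with Test::More. Everything here is an assumption made for illustration: the module source is a hypothetical string rather than a real file, public methods are taken to be subs whose names don't start with an underscore, and each method is assumed to be documented under its own =head2 entry.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical module source, kept in a string so the sketch is
# self-contained. A real test would read the .pm file instead.
my $source = <<'END_SOURCE';
package Austria;

sub capital    { 'Vienna' }
sub population { 8_300_000 }
sub _internal  { 1 }

1;

__END__

=head1 METHODS

=head2 capital

Returns the name of the capital.

=head2 population

Returns the approximate population.

=cut
END_SOURCE

# Names of public subs declared in the code (leading underscore = private).
sub declared_subs {
    my ($src) = @_;
    my @names = $src =~ /^\s*sub\s+(\w+)/mg;
    return sort grep { !/^_/ } @names;
}

# Names that have a =head2 entry in the Pod.
sub documented_subs {
    my ($src) = @_;
    my @names = $src =~ /^=head2\s+(\w+)/mg;
    return sort @names;
}

my @declared   = declared_subs($source);
my @documented = documented_subs($source);

is_deeply(\@documented, \@declared,
    'every public sub has a =head2 entry, and vice versa');

done_testing();
```

The regexes are deliberately naive; they only check that names match, not that the prose still describes what the method does.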

Re:Methods and Pod not actual problem

duncand on 2009-07-24T17:22:07

I agree that indenting methods relative to classes does look better, and I even do that myself sometimes.

Mainly I find that the indenting works best when individual classes are fairly small in the amount of code department, and similarly in those cases I generally don't use the dividing lines I mention.

Where I don't find indenting necessary is when classes are large or there are a number of large methods (ones that fill a screen or more), so that say you've got hundreds to thousands or more of code lines in one class. Generally speaking, indenting is more valuable within a local context than within a much wider context.

The main reason I have methods flush-left within larger classes is that horizontal screen real-estate is valuable and it seems a waste to not use a whole indent level for many thousands of code lines in order for exactly 2 lines to be further left than those. Especially since I strictly limit my line lengths to 75 characters and I use 4 character indents.

My practice has nothing to do with pod. In fact, I think mixing pod with code is a bad idea, and I put all my pod at the end of the file. While I keep my docs for classes and methods in the same order as their corresponding code in general, I believe in writing documentation for the users, and it can be more natural if it is separate; similarly the code can be easier to read.

I write # comments in the few cases where code actually should be explained for code-readers beyond what the user pod says.

So if people want to indent methods, I have no objection save ineffectualness and wasted space.

I will say though, that regardless of whether I indent methods in a class, I always indent other class contents like attribute declarations.

MooseX::Declare has a similar problem with pod

cowens on 2009-07-23T01:30:58

MooseX::Declare adds a block style class to Perl 5 and I have been running into the same POD problem.

Re:MooseX::Declare has a similar problem with pod

masak on 2009-07-23T09:37:08

Ah! I actually suspected as much when writing the post, but I didn't want to complicate the point further by mentioning MooseX::Declare. Thanks for confirming my suspicion.

S02#Whitespace_and_Comments

daxim on 2009-07-23T13:07:50

people who commented things out by putting #s at the absolute beginning of lines might accidentally create embedded comments by placing their # next to a curly brace or a parenthesis or a bracket. [...] I've now learned to put ## at the beginning of lines I want to comment out instead of just #.

Is it too late to spec this in S02 the other way around? Let single # be used for commenting out, no matter what follows. Let ## (perhaps also ### and so on) switch on the special behaviour of brackets etc.

The rationale is:

  1. Using # for commenting out code, as in lots of currently used languages including Perl 5, stays unsurprising.
  2. Enabling some lesser used/esoteric behaviour takes some slightly more effort than the normal case.

Re:S02#Whitespace_and_Comments

masak on 2009-07-23T13:18:32

I kinda like that way of thinking. And no, it's not too late to spec things differently.

However, if you want your proposal to be noticed and possibly acted upon, you really should send an email off to perl6-language. That's where language features and spec changes are discussed.

Re:S02#Whitespace_and_Comments

jmm on 2009-07-23T13:49:16

Hmm, a row of hashes, optionally followed by other text, is often used for a separator, so there is still some potential for getting "special" meaning where none was intended if multi-hash is changed to mean "special" handling.

How about making the special codes a bit harder to get by accident? Something like #{# ... #}# - with no whitespace permitted between the braces and the enclosing hashes - for a block delimited comment, perhaps. That would only be an accidental hazard for people who put a comment in their block header without any whitespace and then try to comment out the line by prepending a hash again without any whitespace. (My normal practice for commenting out lines is to prepend '# ' - the space after the hash makes it easier to still read the commented out code for when you're deciding whether it is time to restore it. So, I'm not personally liable to get hit by the current spec since I'll comment '{' as '# {' rather than '#{'.)

The root problem is interspersing code with POD

abw on 2009-07-24T06:50:10

I have to say, I find the whole idea of POD interspersed with code quite loathsome. I appreciate all the reasons why it's a Good Thing (which boils down to keeping the docs as close to the code as possible), but it has so many drawbacks.

Most notably, it makes it hard to skim through the code when there's big globs of POD getting in the way. If I'm reading the source then it's code I want to concentrate on. If it's the end-user documentation that I want then I'll use perldoc or look at an HTML version of the docs.

It reminds me of the Bad Old Days when CGI scripts were written as an impenetrable mix of Perl code and HTML markup. We learned our lesson on this a long time ago - a separation of concerns is a good thing.

Another problem is that it forces you to write your POD to match the structure of your code or vice versa. This is wrong. The end-user documentation should be written from the perspective of someone wanting to use the module. This is an entirely separate thing from the way that the module is implemented internally. After all, that's what programming is about - creating "black box" abstractions that hide the details of the internal workings.

IMHO, code should be interspersed with comments to explain how the code works, and the end-user POD documentation should be placed at the end of the file. That way you don't have any of the problems with Perl code getting mixed up with POD. Things are badly wrong when you have to change your coding/indenting styling to fit in with the embedded documentation.

But that's just my HO, of course. Each to his/her own.

Re:The root problem is interspersing code with POD

chromatic on 2009-07-24T18:30:49

Another problem is that it forces you to write your POD to match the structure of your code or vice versa. This is wrong.

This is the most salient criticism of interspersed POD, in my mind. I spend a lot of time thinking about how to organize prose -- especially technical prose -- and the only case where interspersed POD reads naturally is solely API documentation.

Re:The root problem is interspersing code with POD

moritz on 2009-07-25T08:57:38

When I actually wrote a Perl 6 module with POD I found that I actually duplicated the signatures in the POD, because they are just so expressive.

So I nagged p6l over this duplication (and triggered a huge flame war :/ ), and as a result Damian Conway came up with a way to include parts of the code from within the POD.

Why am I telling you this here? Because that mechanism will encourage interleaved POD, because it's easier to reference something close by. So the Problem won't go away anytime soon.

Re:The root problem is interspersing code with POD

cjfields on 2009-07-25T16:49:56

Interesting, considering in PBP Damian suggests (In Documentation, under 'Contiguity') not to intersperse POD throughout code, for similar reasons already stated here.

Re:The root problem is interspersing code with POD

Hercynium on 2009-07-27T18:20:49

I think you've touched upon something that leads to the root of the issue - the fact that there are several different types of documentation that are produced, each with different display needs and different audiences, and possibly different authors.

Like you said, comments are written by the programmer, for other programmers (including oneself) and are intended to communicate knowledge about the nearby code.

On the other hand, POD's purpose is a little more loosely defined. POD is usually written by the programmer, usually for the end-user of the code, usually with the purpose of being displayed separately from the code.

Notice I used the word "usually" a lot there.

What this suggests to me is that POD's purpose is probably being stretched a little too far. POD seems to suffer from the problem of being a documentation jack-of-all-trades - and master of none.

Re:The root problem is interspersing code with POD

vrk on 2009-08-11T09:29:38

Exactly! POD is indeed used for multiple purposes; some people have even authored books written in POD.

However, I find the debate between "POD interspersers" and "POD separatists" boils down to only two usages of POD. The first is to document the API, which is a low-level documentation from one programmer to another. The second is to document the program, which is a high-level documentation from the programmer to the user. Incidentally, these two use cases follow closely the preferences of "POD interspersers" and "POD separatists", respectively!

I tend to intersperse POD liberally with my code. For each method or function, I dedicate a third level heading and a description (a paragraph or two, maybe a list or two). These, in turn, are roughly grouped into categories under second level POD headings, if possible. Often it _is_ possible, since Perl is quite flexible in subroutine definition placement.
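
As an illustration of that layout, here is a minimal Perl 5 sketch. The package, the method names and the category heading are hypothetical; only the structure (a second-level heading for the group, a third-level heading plus a short description per method) follows the description above.

```perl
use strict;
use warnings;

package Austria;

=head2 Basic facts

=head3 capital

Returns the name of the capital city as a string.

=cut

sub capital { return 'Vienna' }

=head3 describe

Returns a one-line description built from the other accessors.

=cut

sub describe {
    my ($class) = @_;
    return 'Capital: ' . $class->capital;
}

package main;

print Austria->describe, "\n";
```

Written this way, the headings can exist before any code does, and the subs are filled in between the Pod paragraphs afterwards.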

While many programmers find it hard to scan and read the code in this case, I find it both easier and liberating. Easier, because there is not only the code but also the intent of the programmer right in front of my eyes (assuming the documentation is up to date...); liberating, because I can start writing the API documentation before I have written a single line of code, and then just fill in the gaps between paragraphs. It also enforces a high-level structure that wraps the code module into a narrative one can read from start to end -- or jump into using the table of contents.

However, this is clearly low-level documentation. I often add user documentation at the top of the file in a separate DESCRIPTION or SYNOPSIS section, or just create a separate POD file for those.

Coming back to the raised issues, POD is a valuable tool for many purposes. I don't think it's stretched too far. Part of the reason why it can be used in so varied cases is its simplicity, and also because it is like a traditional comment on steroids: it can be easily extracted. With POD syntax, one can mark which comments communicate high-level information and which do not.

With that in mind, I agree with masak: it should be possible to indent POD blocks with arbitrary whitespace. They may not be as visually separated from the code as they are now, though. Doesn't this indicate a change in syntax is required?