GIVE ME Plain Old Documentation OR I'LL KILL YOUR MOTHER!!!

mugwumpjism on 2005-03-18T00:59:39

Ask half a dozen programmers to make a decision on a Wiki markup, then be prepared to wipe the blood off the floor.

Back in the good old days, you couldn't really afford to make bad design decisions: the computers took so damned long to do anything, and the languages were so slow to implement anything in, that bad design decisions would prove fatal. That is why the designers of SGML and TeX chose formats that might not look the prettiest but were always guaranteed to be easy to parse.

Then along came the internet, and the fantastic modern wonder that is forms-based programming. Good old FORMS, eh? The peculiar characteristics of the rancid TEXTAREA tag started a whole movement of people who thought it would be a good idea to try to shoehorn structured document markup into these tiny, monospaced spaces.

And so, the Wiki was born. Anyone who's ever tried to use a Wiki that someone else has made, and/or tried to extend one, will very quickly come to the realisation that most Wiki markup forms are very poorly designed and barely thought through. And that's before you look at the shocking implementations.

Look at Kwiki! If you installed an early version of Kwiki and pulled it apart, you will have seen a prime example of one of the biggest markup mistakes - the lack of recognition of a document structure. Instead, the document is modelled as a series of lines which regular expressions can be applied to, to break it up into discrete chunks of "formatted" blocks which have already been rendered. A simplistic approach, quite suitable for many uses, but ultimately doomed.

That's hardly relevant now. Brian Ingerson is perhaps one of the few programmers I've encountered with an attitude that belies his experience - a state of humility rarely inhabited by Perl programmers. Because of this, the above comment is the complete opposite of scathing - it is precisely anti-scathing, because he had the power to say "you know, you're right, and now I need to put this down and build it from scratch again".

But stop, you say, what about the RULES!

There are Three Virtues of a programmer: Laziness, Impatience, and Hubris.

These rules have to be taken in the context of programming. As a programmer you should be lazy - trying to achieve as much as possible with the minimum of total development time, code and computing resources; impatient - refusing to repeat code or endure slow algorithms; and full of hubris - taking ultimate pride in the works that you create. That does not mean that you should also be a lazy individual who won't even help your aging mother wash the dishes when you visit her for dinner, so impatient that you won't even hear out alternate points of view, and so arrogant that you truly believe that your own point of view trumps anything anyone else is likely to produce in your short and pathetic lifetime.

But I digress. We were talking about Plain Old Documentation. One of the fundamental mistakes in POD is that access to a document is in terms of these opaque things called "POD paragraphs": sequentially numbered chunks of text that came from the source file. But didn't we just go over why that is wrong?

After all, there is some semblance of structure in POD documents. Inline styles, like C<E<lt>>, are structured: one code nests inside another. =over and =back are a form of structure too, opening and closing a block.
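To make the point about inline styles concrete, here is a minimal sketch of reading something like C<E<lt>> as a tree rather than as a flat string. It is written in Python purely for brevity, the function name parse_inline is made up for this illustration, and it only understands the single-bracket form (not the doubled C<< ... >> delimiters), so treat it as a toy rather than a real POD parser.

```python
def parse_inline(text):
    """Parse single-bracket POD inline codes into a nested tree.

    Returns a list whose items are plain strings or (code, children) tuples.
    Hypothetical helper for illustration; not part of any POD module.
    """
    pos = 0

    def parse_nodes(nested):
        nonlocal pos
        nodes, buf = [], []
        while pos < len(text):
            ch = text[pos]
            # A capital ASCII letter directly followed by '<' opens a code.
            if "A" <= ch <= "Z" and text.startswith("<", pos + 1):
                if buf:
                    nodes.append("".join(buf))
                    buf = []
                pos += 2  # skip the letter and the '<'
                nodes.append((ch, parse_nodes(True)))
            elif ch == ">" and nested:
                pos += 1  # consume the closing '>' of the current code
                break
            else:
                buf.append(ch)
                pos += 1
        if buf:
            nodes.append("".join(buf))
        return nodes

    return parse_nodes(False)


tree = parse_inline("like C<E<lt>>, are")
# tree == ["like ", ("C", [("E", ["lt"])]), ", are"]
```

The nesting falls straight out of the recursion: the E code ends up as a child of the C code, which is exactly the structure a line-and-regex approach throws away.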

So, if you could transform the stream of "POD paragraphs" into a stream of parse tokens, then give =for WHAT a nice way of doing the same, then you could potentially give POD dialects all the hooks they need to work everywhere they have to work.
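A rough sketch of what that could look like, again in Python for brevity and with everything named here (pod_paragraphs, tokenize, the token tuples, the for_handlers mapping) invented for the illustration rather than taken from any existing module. It assumes paragraphs are separated by blank lines, that =over and =back become start/end tokens, and that a dialect registers a handler for its own =for format name.

```python
import re


def pod_paragraphs(source):
    """Split POD source into paragraphs (chunks separated by blank lines)."""
    chunks = re.split(r"\n(?:[ \t]*\n)+", source.strip("\n"))
    return [c for c in chunks if c.strip()]


def tokenize(paragraphs, for_handlers=None):
    """Yield (kind, name, text) tokens for a sequence of POD paragraphs.

    for_handlers maps a '=for' format name to a callable that takes the
    paragraph body and returns an iterable of tokens (a dialect hook).
    """
    for_handlers = for_handlers or {}
    for para in paragraphs:
        match = re.match(r"=([A-Za-z]\w*)\s*(.*)", para, re.S)
        if match:
            name, rest = match.group(1), match.group(2).strip()
            if name == "over":
                yield ("start", "over", rest)
            elif name == "back":
                yield ("end", "over", "")
            elif name == "for":
                parts = rest.split(None, 1)
                fmt = parts[0] if parts else ""
                body = parts[1] if len(parts) > 1 else ""
                handler = for_handlers.get(fmt)
                if handler is not None:
                    # Let the dialect turn its own paragraph into tokens.
                    yield from handler(body)
                else:
                    yield ("for", fmt, body)
            else:
                yield ("command", name, rest)
        elif para[0] in " \t":
            yield ("verbatim", "", para)
        else:
            yield ("text", "", para)


source = "=head1 NAME\n\nFoo - bar\n\n=over 4\n\n=item one\n\n    code here\n\n=back\n\n=for test ok(1)\n"
handlers = {"test": lambda body: [("test", "", body)]}
tokens = list(tokenize(pod_paragraphs(source), handlers))
```

A consumer that doesn't know about the "test" dialect simply leaves the handler out and gets a generic ("for", "test", ...) token it can skip, so the dialect never has to fight the base format.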

But, the alarmists scream, "You're fracturing POD! POD should be SIMPLE! and PLAIN! and OLD!!! and DOCUMENTATION!!! HLARGHARAGHARHARHG!!!!"

But what do Scriptalicious, OODoc, Pod::Constants, Test::Inline, Pod::DocBook, Pod::XPath, Pod::Coverage, and a trailer park full of other similar modules have in common? They're all treating the documentation as a single reference point, by which incidental things like run-time values of constants or test cases are separated from the real code. They're inferring structure where there is not much by design. They're effectively POD dialects.

So let's build a structure for these modules to receive structures rather than opaque chunks of text, and move the formatting rules about what C<codes> are used into one place rather than many.