Simplified parseable Changes: draft 2

hex on 2007-11-12T11:46:38

I got a lot of positive feedback about my proposal for a machine-readable Changes format. Following on from all the suggestions, this is the revised spec. The big difference from the first version is the expansion of the identifying tokens from single letters to full words, at Juerd's suggestion. The "@" symbol is also becoming overloaded these days, so I've dropped it.

  • Each release is represented by a block of lines. Double line breaks separate releases.
  • A line beginning with \s+ is interpreted as the continuation of the preceding line.
  • Each line begins with a token denoting what item of change metadata it describes, followed by a colon and \s+.
  • The token may optionally be suffixed with an exclamation mark as an importance indicator, meaning the metadata item is important. When applied to a version number, it implies "major release". (Applying it to a date or comment is meaningless and should be ignored by any parser.) How an important item is rendered is up to parser authors.
  • Valid tokens are:
    • change - a change to the code of some kind
    • docs - a change to the documentation
    • fix - a bug fix
    • incompatible - a change that is incompatible with earlier releases
    • license - a change to the license
    • maintainer - a change to the maintainer(s)
    • new - an addition of something
    • security - a security fix
    • tests - a change to the test suite
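
To make the rules above concrete, here is a rough sketch of a parser in Python. It is only one possible reading of the spec, not a reference implementation. The names parse_changes, TOKENS and ITEM_RE are made up for illustration. Two things are assumptions taken from the examples in this post rather than from the list above: "version" and "date" are accepted as tokens, and lines starting with "#" are skipped as comments.

```python
import re

# Tokens from the list above, plus "version" and "date", which the examples
# use but the list does not name (an assumption of this sketch).
TOKENS = {
    "version", "date", "change", "docs", "fix", "incompatible",
    "license", "maintainer", "new", "security", "tests",
}

# A token, an optional "!", a colon, then whatever follows on the line.
ITEM_RE = re.compile(r"^([a-z]+)(!?):\s*(.*)$")


def parse_changes(text):
    """Parse a Changes document into a list of releases.

    Each release is a list of (token, important, text) tuples.
    """
    releases = []
    # Double line breaks separate releases.
    for block in re.split(r"\n\s*\n", text.strip()):
        items = []
        for line in block.splitlines():
            if line.startswith("#"):
                continue  # comment line (syntax assumed from the examples)
            if line[:1].isspace():
                # Leading whitespace: continuation of the preceding line.
                if not items:
                    raise ValueError("continuation line with nothing to continue")
                token, important, body = items[-1]
                items[-1] = (token, important, (body + " " + line.strip()).strip())
                continue
            match = ITEM_RE.match(line)
            if match is None or match.group(1) not in TOKENS:
                raise ValueError("unrecognised line: %r" % line)
            token, bang, body = match.groups()
            # "!" on a date is meaningless, so it is ignored there.
            important = bang == "!" and token != "date"
            items.append((token, important, body.strip()))
        if items:
            releases.append(items)
    return releases
```

Continuation lines are joined to the preceding item with a single space, which is a choice of this sketch; the spec does not say how the pieces should be joined.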

[Changes to this post following comments: removed "removed" and re-added "change".]

Following are three examples of valid documents in different styles.

# This version was codenamed Muffin because we were listening to Frank Zappa at the time.
version!:
    1.3
date:
    2007-11-08T11:15
maintainer!:
    This project is now maintained by ZIRCON (of Zircon Software fame).
license!:
    We have switched licenses. This software now uses the Greater Zork Software License.
    Please ensure that you have read the new license before using this software.
new!:
    New frobnitz() method - save 50 lines of manual frobnitzing by using this instead!
fix!:
    Fixed the error in quack() where it would actually moo instead of quack. [RT 1234]
incompatible!:
    The calling convention for rumpelstiltskin() has CHANGED. See perldoc.
tests!:
    Test coverage is now 100%! Please go nuts testing this release on your machines
    and let us know what happens.

This one has a more compact look:

version:      3.1
date:         1992-04-06
# You guys are going to love this one. -- billg
# Watch out for Kato, coming this October with native networking support! -- steveb
new:          TrueType font system. No more need for Adobe Type Manager.
new:          32-bit disk access.
new:          Awesome game called Minesweeper. Say goodbye to your productivity.
incompatible: We dropped Reversi. Minesweeper is better, trust us.
incompatible: Can't run in real mode.

Even more compact, without the nice alignment:

version: 1.2.3
date: 2007-11-12
new: beefsteak() gives you beefy goodness
fix: tracked down a memory leak in mtfnpy()
tests: added pod coverage
change: refactored ugly get/set methods into AUTOLOAD

Thoughts?


Restating thoughts

Juerd on 2007-11-12T15:17:15

"removed" is always also "incompatible". I don't think there is any need for having both.

I don't particularly like "version" and "date". They're different in the sense that they must (or at least should) always come first, but now look like all the others. It's not a big deal, though. I really think this proposal is much better than the previous. It's good that you considered multiline items too.

Can the markers be made case insensitive? I think people may like Version or SECURITY.

Re:Restating thoughts

hex on 2007-11-12T15:56:45

Copied from your comment on the first post...

> As for the timestamp, you'd have two things, whitespace separated, instead of one. dateTtime may be the standard, but date time is much more commonly seen in the wild. And for a very good reason.

Hm. I suppose it is meant to be a simple format. Okay - so long as the dates are yyyy-mm-dd[ hh:mm[:ss]] and nothing else, I can live with that.
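
For what it's worth, a strict check for that date shape could look something like the Python sketch below. The names DATE_RE and parse_date are invented for illustration, and rejecting everything else (including the "T" separator) is just my reading of "and nothing else".

```python
import re
from datetime import datetime

# yyyy-mm-dd, optionally followed by " hh:mm" and then ":ss".
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}( \d{2}:\d{2}(:\d{2})?)?")


def parse_date(value):
    """Return a datetime for yyyy-mm-dd[ hh:mm[:ss]], else raise ValueError."""
    # The regex enforces the exact shape (zero-padded fields, space separator);
    # strptime then rejects impossible values such as month 13.
    if DATE_RE.fullmatch(value) is None:
        raise ValueError("not in yyyy-mm-dd[ hh:mm[:ss]] form: %r" % value)
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("invalid date or time: %r" % value)
```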

I suppose you're right about "removed", in a sense, as it is a feature-oriented label. I just got used to it balancing "added". Let's drop it then.

> I think the important marker itself is not important if you split out security/incompatible. If something new is important, bump the version number.

I'd like to keep it because I'd like to be able to mark particular features as important, without tying myself to any particular scheme of version numbering.

> I don't particularly like "version" and "date". They're different in the sense that they must (or at least should) always come first, but now look like all the others.

I changed "@" and "v" to "date" and "version" because none of the other tokens were single letters/symbols any more (as well as the overloading of "@" that I mentioned); it seems worthwhile to aim for consistency to improve simplicity, imho.

> Can the markers be made case insensitive? I think people may like Version or SECURITY.

Okay, as long as it's explicitly stated that conforming parsers must not attach any relevance to token case.
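
In a parser, that probably amounts to folding the token to lower case before looking it up. A tiny Python sketch, where normalise_token is a made-up helper name:

```python
def normalise_token(raw):
    """Split a raw marker such as 'SECURITY!' into (token, important)."""
    important = raw.endswith("!")
    name = raw[:-1] if important else raw
    # Case carries no meaning, so Version, VERSION and version are the same.
    return name.lower(), important
```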

Re:Restating thoughts

bart on 2007-11-12T19:33:35

> "removed" is always also "incompatible". I don't think there is any need for having both.

I disagree. An addition is incompatible too. There's no harm in being a bit more specific!

To me, "incompatible" means a change in how things are used: the functionality is still there, only, you'll have to try to achieve your goal in a slightly different manner: for example, using another function, or with a change in parameters.

If older functionality is no longer available, that's more than just a change.

Re:Restating thoughts

Juerd on 2007-11-13T10:08:42

Incompatible, in a changelog, means: you should anticipate breakage caused by this change.

Pure additions typically never break existing code.

Maintainer?

jtrammell on 2007-11-12T15:36:24

I'm curious why "maintainer" is the only token that's abbreviated. Maybe "owner", if length is the issue?

Re:Maintainer?

hex on 2007-11-12T15:48:27

Good catch. Fixed.

Re:Maintainer?

hex on 2007-11-12T15:58:14

Although come to think of it, "docs" is an abbreviation, really... :-)

Re:Maintainer?

jtrammell on 2007-11-12T16:09:51

Indeed!

hmm!

Alias on 2007-11-12T23:38:29

all!:
those!:
exclamation!:
marks!:
look!:
really!:
weird!:

Re:hmm!

hex on 2007-11-13T10:30:47

They're meant to stand out.

Some comments ...

drhyde on 2007-11-13T15:01:40

I prefer single characters instead of words; + and - for additions and removals (like diff); removal and other changes should be distinct because removal does *not* imply incompatible (see the example below).

Any thoughts on how to deal with long text for any entry? I suggest that if a line starts with whitespace, it is assumed to be a continuation of the previous line:

-: removed the dependency on Foo::Bar, which fails its tests on some platforms.
   It is now optional, and without it frobnitz() won't work.
   (see RT ticket 12345).

I suppose some way of linking each change to a ticket (and of course specifying where the ticket is - rt.cpan? code.google?) and a CVS commit would be nice too, although it should be optional as most people aren't that anal.

Re:Some comments ...

hex on 2007-11-13T18:06:47

> I suggest that if a line starts with whitespace, it is assumed to be a continuation of the previous line

That'd be the part of the spec that says "A line beginning with \s+ is interpreted as the continuation of the preceding line" :-)

Re your other points, I liked the single letters more too but people were keen on the whole word versions. I'll let them respond...

I thought of maybe inventing a format for specifying bugtracker URLs but then it was getting a bit too complicated to merit the "simplified" title.

Re:Some comments ...

drhyde on 2007-11-13T18:20:07

Furrfu! You don't expect programmers to read the documentation do you? :-)

Re:Some comments ...

Juerd on 2007-11-13T21:31:55

That kind of removed is not the kind of removed I was thinking of. Such confusion is another reason to remove "removed" from the list.

However, it does uncover a missing part of the changelog spec: a keyword for changes that are neither bugfixes nor incompatible, like backend changes (removed dependencies) and performance optimizations. I suggest "change" because it is nicely neutral. (I was thinking of "improvement" first, but it is too long and internals changes can be done for other reasons too.)

Re:Some comments ...

hex on 2007-11-13T21:37:58

Actually, the first draft spec had "c" for "change" in it, and I think I dropped it by mistake when drawing up this new one. I agree that it should go back in.