There's a lot of discussion going on at the moment about machine-readable Changes (or CHANGES) files: miyagawa, LTjake. hanekomu put together a new module, Module::Changes, to parse a "Changes.yml" file; RGiersig made some suggestions for the content of that file.
Discussion so far has mainly been around the use of YAML. Points raised:
Thinking about all of these, I propose the following. Design constraints were (a) granularity (including Skud's suggestions of what to mention), (b) an absolute minimum of chrome, and (c) trivial to transform into other formats (such as RDF).
v! 1.3
@ 2007-11-08T11:15
# This version was codenamed Muffin because we were listening to Frank Zappa at the time.
m! This project is now maintained by ZIRCON (of Zircon Software fame).
l! We have switched licenses. This software now uses the Greater Zork Software License.
Please ensure that you have read the new license before using this software.
a! New frobnitz() method - save 50 lines of manual frobnitzing by using this instead!
b! Fixed the error in quack() where it would actually moo instead of quack. [RT 1234]
c! The calling convention for rumpelstiltskin() has CHANGED. See perldoc.
t! Test coverage is now 100%! Go us!

v 1.3_01
@ 2007-11-07T09:20
# Developer preview for 1.3 and the CPAN testers.

v 1.2.1
@ 2007-11-02T20:08
d Fixed some POD formatting mistakes.
c Refactored accessors into AUTOLOAD. Makes no external difference.
r Removed the deprecated honkhonkhonk() method as warned several versions ago.
As you can see, each version is represented by a block of lines. Double line breaks separate versions. Each line begins with a token denoting what it describes, optionally suffixed with an exclamation mark, which means "important". When applied to a version number, it implies "major release". (Applying it to a date or comment is meaningless and should be ignored by any parser.) The token is followed by \s+. If an item is split onto multiple lines, it is understood to continue until a new token or block break is reached.
These are the tokens:
@ Release date. In W3C datetime format (ISO 8601).
# A comment.
a An addition to the code.
b A bugfix. Linking to a ticket here would be nice if it exists.
c A change to existing code.
d A change to documentation.
l A change to licensing.
m A change to the maintainer.
r A removal of something from the code.
t A change to tests.
v A version number.
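To make the rules concrete, here is a rough sketch of a parser for this draft. It is written in Python purely for illustration; the function name parse_changes and the dictionary layout are my own assumptions, not part of the proposal or of Module::Changes.

```python
import re

# One token letter from the list above, an optional "!" for "important",
# then whitespace and the item text (the text may start on the next line).
TOKEN_RE = re.compile(r"^([@#abcdlmrtv])(!?)(?:\s+(.*))?$")

def parse_changes(text):
    """Parse a Changes file in the draft format into a list of version blocks.

    Each block is a list of dicts with 'token', 'important' and 'text' keys
    (a hypothetical layout chosen for this sketch).
    """
    releases = []
    # A double line break (one blank line) separates version blocks.
    for block in re.split(r"\n\s*\n", text.strip()):
        items = []
        for line in block.splitlines():
            match = TOKEN_RE.match(line)
            if match:
                token, bang, body = match.groups()
                # "!" on a date or comment is meaningless, so ignore it there.
                important = bang == "!" and token not in "@#"
                items.append({"token": token,
                              "important": important,
                              "text": (body or "").strip()})
            elif items:
                # No token: the line continues the previous item.
                joined = items[-1]["text"] + " " + line.strip()
                items[-1]["text"] = joined.strip()
        releases.append(items)
    return releases
```

Calling parse_changes() on the example above should give three blocks, the first starting with the "v" item for 1.3 flagged as important. Note that this sketch follows the draft literally, so any line that starts with a token letter followed by whitespace is treated as a new item.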
I haven't gone quite as far as RGiersig did in his specification, as I felt that was a bit heavy. For example, release stability in my scheme is indicated by the version number: it is implied by the existing convention of underscored version numbers for developer releases.
Vague other thoughts - case-insensitive tokens? And maybe a standard block of comments at the beginning of the file explaining what the tokens are to new readers.
Thoughts? I actually like this enough that I might start using it myself.
Update: There's a second draft now.
Re:using x! instead of !x for important items
hex on 2007-11-09T13:22:03
Good point - I changed it.
Re:Hard to read...
hex on 2007-11-09T14:47:47
The spec allows you to do this if you want:
a!
Added some groovy new feature.
b
Fixed that stupid little bug in the gnomon.
Any better?
Each line begins with a token denoting what it describes... The token is followed by \s+. If an item is split onto multiple lines, it is understood to continue until a new token or block break is reached.
Maybe I missed something, but what do you do if the word 'a' is the first letter of an item split onto multiple lines? How does the parser know that's not a token?
Re:Ambiguity
hex on 2007-11-09T16:30:12
Ooh, good catch. As it stands, it wouldn't. The workaround is not to split a line before an "a" :-)
If anyone can think of a patch to the spec to fix that without adding complexity (I can't off the top of my head) I'd be interested to hear it.
Re:Ambiguity
Ovid on 2007-11-09T17:13:14
As a format which could conceivably be written in other (human) languages, can you guarantee that none of them will have the same issue? Or that someone might refer to their 'd' subroutine and mess things up?
Maybe subsequent lines could be indented or the preceding line could end in a backslash?
Re:Ambiguity
hex on 2007-11-09T17:41:02
confound on IRC suggested starting continued lines with a '.', but that's more chrome to impede a quick visual scan of the document, as are backslashes. On the other hand, the backslash is a well-known line continuation indicator. I prefer your suggestion of indenting, though. Leading whitespace already seems to be commonly used on CPAN to indicate a continued comment.
a We added a new shiny feature that you'll all love:
  a magic automatic doodad configurator.
b! A major bug got fixed. Really major. It was so awful,
  in fact, that I can only talk about it in Latin:
  Lorem ipsum dolor sit amet, consectetuer adipiscing
  elit. Nulla iaculis mi quis mi. Quisque nibh neque,
  gravida quis, bibendum vitae, aliquet ut, enim.
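For what it's worth, the indentation rule is easy to handle in a parser. A minimal sketch in Python, assuming continuation lines always begin with whitespace; parse_block is a hypothetical helper name, not anything from an existing module:

```python
import re

# A token letter at column zero, an optional "!", whitespace, then the text.
TOKEN_RE = re.compile(r"^([@#abcdlmrtv])(!?)\s+(.*)$")

def parse_block(lines):
    """Parse one version block in which continuation lines are indented."""
    items = []
    for line in lines:
        if line[:1].isspace():
            # Indented: always a continuation, even if the first word
            # happens to be "a" or "d".
            if items:
                items[-1]["text"] += " " + line.strip()
            continue
        match = TOKEN_RE.match(line)
        if match:
            token, bang, body = match.groups()
            items.append({"token": token,
                          "important": bang == "!",
                          "text": body.strip()})
    return items
```

Because the whitespace check comes first, a continuation line beginning with "a magic..." is never mistaken for an addition.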
Re:remove/delete
hex on 2007-11-10T14:19:39
Hmm. "d" for "delete" is a good point; however, I find the phrasing "I deleted a feature" a little awkward.
How about we take a leaf out of diff -u's book and circumvent the issue of which word to use?
- removed something
+ added something
There's no ambiguity in that...
Re:remove/delete
Eric Wilhelm on 2007-11-11T06:26:44
I find the alphabetical codes rather unreadable. They mix too much with the text. Having tags for version number and date seems redundant when those two items are essential (and currently standard anyway.) In my rendition, the version and date are a non-indented header over a set of indented paragraphs which start with sigils.
v0.1.1 2007-11-10
  + added new thing() method
  - removed old deal() method
  * fixed bug #12578
  % changed code for blah()
  ? documentation updates for bop()
  $ license change
  ^ maintainer change
  = fixed tests on VMS
  ! incompatible change notice blah blah
I think most of those sigils are self-explanatory, except maybe 'maintainer' -- it looks like a house.
Re:remove/delete
Aristotle on 2007-11-11T07:49:03
I thought the metaphor for the maintainer symbol was that someone changed their “hat” (as in “putting on my group leader hat, I say that […]”). That seemed funnily apt to me.
Could even uppercase the tags...
v 1.00
@ 1234-12-34 12:34
# I'm so happy with this release
security: fixed buffer overflow in Foo->cookies
fix: orange should have been blue, not red.
incompatible: removed emacs support
Result: writable AND readable, and important things stand out because they're longer
v 1.00
@ 1234-12-34 12:34
# I'm so happy with this release
SECURITY: fixed buffer overflow in Foo->cookies
FIX: orange should have been blue, not red.
INCOMPATIBLE: removed emacs support
Re:Compatibility and security
Skud on 2007-11-11T23:11:45
I agree with Juerd on all points, but most especially that using a word rather than a letter helps a lot with readability. So, what he said.
Re:Compatibility and security
hex on 2007-11-12T00:18:41
Compatibility, security, fix: agree that these are necessary splits to "bug fix" ("b" in my original scheme).
Timestamps: these follow the format specified in ISO 8601, where the "T" is a mandatory separator. I'd like to stick to an existing standard of date representation if possible.
I think uppercasing is too shouty... adding the important marker would make you end up with "FIX! SECURITY! NEW!". It's a bit tabloid newspaper. :-)
With all this in mind I'm going to post a revised spec shortly for a second round of comments. Thanks!
Re:Compatibility and security
Juerd on 2007-11-12T10:45:38
I think the important marker itself is not important if you split out security/incompatible. If something new is important, bump the version number.
As for the timestamp, you'd have two things, whitespace separated, instead of one. dateTtime may be the standard, but date time is much more commonly seen in the wild. And for a very good reason.
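On the parsing side the two spellings need not be a big deal. As a small illustration in Python (parse_release_date is just a name made up for this sketch), the standard library's datetime.fromisoformat() accepts any single character between the date and the time, so both forms come out the same:

```python
from datetime import datetime

def parse_release_date(value):
    """Parse a release date written as 'dateTtime' or as 'date time'."""
    # fromisoformat() allows any single separator character between the
    # date and the time, so "T" and a space are both accepted.
    return datetime.fromisoformat(value.strip())
```

So parse_release_date("2007-11-08T11:15") and parse_release_date("2007-11-08 11:15") should yield equal datetime values; which one to write in the file is then purely a readability question.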