Changes.yml specs v0.01

brian_d_foy on 2007-09-07T11:50:00

OK, I'll bite. Here is a first rough draft for discussion and brain-storming, please post suggestions for add'l hash keys. Basic format of Changes.yml should be a stream of hashes, each hash defining one release.


---
version: 0.02 # plain number or version string, required
released: 2007-09-06 # datestring, required
author: Roland Giersig <RGiersig@cpan.org> # release author, required
stability: draft # stable | draft, required
backward-compatibility: partial # full | partial | none, required
breaks-compatibility-in: # list of subs, optional
  - foo()
  - bar()
bugs-fixed: # list of bugs with references, optional
  - foo() crashes when given undef parameter <http://rt.cpan.org/Ticket/Display.html?id=1234>
  - bar() should return "blah"
changes: # list of text, required
  - rearranged arguments to foo()
  - changed return value of bar()
---
version: 0.01
released: 2007-09-05
author: Roland Giersig <RGiersig@cpan.org> # release author, required
stability: draft
backward-compatibility: none
changes:
  - first release
---


What I'd want

Skud on 2007-09-07T13:52:46

I've been doing CPAN Watch for a couple of weeks now and I'm starting to get a handle on what I'd want if I had a machine-parsable changelog. A few random notes:

To me, the main distinction in changelogs is "noteworthy" and "not noteworthy". Bugfixes, additional features, and breaking backwards compatibility are noteworthy. Things like packaging fixes, test improvements, etc are not noteworthy. Admittedly my criteria are very personal here, but I think the distinction of "you might want to upgrade on account of this" vs "mostly of use to the maintainer" is worth noting.

A week or so back I came across some kind of XML format for changelogs. I have no recollection as to what it was, just that I hated it for its non-human-readability. You might want to investigate and see what it includes.

Random things that you might want to be able to capture in your YAML format:

  • change of license
  • change of maintainer
  • flag for "major release!!!!1"
  • packaging changes (is license change a subset of this?)
  • test changes
  • documentation changes
  • bugfixes
  • features added *or* removed
  • changes in maturity, eg "I now consider this to be out of alpha, and ready for use by others" or "this is a release candidate"
  • developer releases?

And please, make the YAML format be in reverse chronological order. This is much better for human readability because hte most recent changes are most readily apparent.

Re:What I'd want

RGiersig on 2007-09-07T14:30:52

Hmm, cpan watch would really be a major user of this, so I'll keep that in mind. So I guess making too many things in Changes.yml optional is bad, either an author should include all infos or just leave out Changes.yml alltogether.

So the release info should contain human-readable info on the one hand like a one-sentence synopsis for cpan-watch and parseable info on the other hand like API-changes, changes in dependencies etc that allows some nifty tools to combine that with info from other packages to e.g. grade the module with regards to its complexity.

I just took a look into META.yml and to me it looks like that that info should also merge into Changes.yml.

Also included should be the DSLIP info from the long module list (http://www.cpan.org/modules/00modlist.long.html) which would be helpful to point out requirement changes from one version to the next.

Hmm, this is getting long... :o)

Re:What I'd want

sjn on 2007-09-07T15:52:46

META.yml is rather established, and quite useful for determining the current state of a package. I don't think it's productive to speculating in merging files or making one replace the other.

Instead, I suggest that Changes.yml ONLY contains information regarding the changes between releases (as opposed to info regarding about the release itself, which is available in META.yml). That would mean fields like "author" should be redundant/meaningless in Changes.yml.

Furthermore, there's much to be won by making it easy to write a Changes.yml file - if people are "forced" to write a lot of redundant information in order to make a "good" or "valid" Changes.yml file, then they may conclude that this may not be worth the effort. In other words, there may be some worth in considering what fields "MUST", "SHOULD" or "MAY" be required. :)

Other than that - good stuff! Keep up the good work! :)

Re:What I'd want

RGiersig on 2007-09-09T08:53:02

Right, I don't intend to merge files, and be it just for backward-compatibility.

With "author" I meant the author of that specific release, not necessarily of the whole module. I don't know if "pumpking" would be better, as I see that title as a honor to those that coordinate releases on really large module/packages/whatever. Maybe "brought-to-you-by" would be better... No, let that be "release-author". :-)

[Which brings me to the point if there shouldn't be a list of bug/patch contributors that could be automatically parsed and assembled to a "honorable mentions" page somewhere...]

Regarding the optional info, I would go another way that strikes me as psychologically good, I would provide a template that contains all the tags with default values, so if an author doesn't want to fill something out, he doesn't need to because it is already filled out. This template would ideally be auto-generated in the standard "perl Makefile.PL" process.

And as a second step I would vote for a nag message for "make dist" that would check the existance of a Changes.yml file and nag the author to fill out the template (without requiring it), also giving a link to more info about Changes.yml.

Maybe something more along the TAP philosophy?

GunnarWolf on 2007-09-07T16:23:40

I really like the TAP in that whatever its parser does not understand is taken as a comment (and ignored, of course). I'd suggest you to take a look at how Debian handles its changelogs: They have a clearly defined format to allow programs to harvest its most important bits of information (i.e. version, timestamp, maintainer, priority...). Each changelog entry line is then basically text, but can specify some action (i.e. having a «(Closes #nnnnnn)» at the end causes that bug to be closed. This has the advantage to clearly point out which action caused which bug to be closed - it's not the whole version which closes a bug, but a specific action included in it.

Re:Maybe something more along the TAP philosophy?

GunnarWolf on 2007-09-07T16:27:18

Just to illustrate my point, the changelog for mod_perl2 in our SVN repository.

Maybe we should take a step back...

RGiersig on 2007-09-08T22:57:34

Thniking about everything I come to a wider viewpoint: the Changelog is actually a diff between two module feature descriptions. So how about doing exactly that: lets create a formalized feature description for each version. This description needs only be quite basic, because we only want the diffs. So the first release is described by its documentation and following releases are adding and changing its features.

Come to think of, the process is in two steps:

With each release, a new specification is created. After the release, the original specification is retroactively changed by reported but previously unknown "features" that we normally call "bugs". During bug-fixing, these "features" are removed and the change is documented in the specification of the next release.

I don't know what I will make of all this, it's rather late and I'm going to bed now.

Does this make any sense for you others that are not me? :o)

Re:Maybe we should take a step back...

sjn on 2007-09-10T11:17:38

Interesting idea, but perhaps it's better not to over-engineer this early? :)

I think it would be better to define a minimal spec that's easy to implement, and possible to extend when it's useful to do that. And of course, it should be human-readable/scannable firstly (as that's the role of the Changes files today). Using valid YAML ought to fix the "machine-parsable" requirement nicely.

YAML.pm seems to have tried to do something in that direction too, btw. Maybe worth taking a look/having a chat with some of the guys behind that module?

Re:Maybe we should take a step back...

RGiersig on 2007-09-10T15:45:57

Well, I see adaption of such a thing mainly as a psychological problem, not a technical one. So I want to get the technique right, and this means to decide what semantics we want here. But you are right, lets keep it small and simple. :-)

Yes, the YAML changelog is valid YAML. Apart from that, they started out to put each change as one stream entry, thus generating multiple entries with the same version instead of having one entry per version and the changes as a list. Well, that's a mistake that I wont make... :-)