Sorting CPAN

barbie on 2008-11-19T12:46:33

One of the problems with the CPAN Testers website resources is that whenever an author's list of distributions, or the list of versions for a distribution, is required, a lot of backend trawling is done. This is because the current backends have to refer to 3 sources to get those lists. Even then the resulting lists aren't quite correct, as version sorting can be slightly weird once you take into account that every author has a slightly different perception of versioning. Sort::Versions goes a long way, but it isn't 100% accurate. The only really accurate way of sorting is on the release date of a distribution, and until now that information hasn't existed in a single form.
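To make the sorting problem concrete, here is a minimal Python sketch. The release history is invented for illustration, and the plain string comparison is only a stand-in for "a sort that misreads the author's intent"; it is not how Sort::Versions behaves. The point is simply that a release date gives an ordering that doesn't depend on how the version string is interpreted.

```python
from datetime import date

# Hypothetical release history for one distribution: (version, release date).
# The versions and dates are made up for illustration.
releases = [
    ("0.9", date(2007, 1, 10)),
    ("0.10", date(2007, 3, 2)),
    ("0.10_01", date(2007, 4, 15)),
    ("1.0", date(2007, 6, 1)),
]

def sort_by_version_string(items):
    # Naive lexical comparison of the version strings (a stand-in only).
    return sorted(items, key=lambda r: r[0])

def sort_by_release_date(items):
    # Ordering by release date needs no interpretation of the version.
    return sorted(items, key=lambda r: r[1])

by_string = [version for version, _ in sort_by_version_string(releases)]
by_date = [version for version, _ in sort_by_release_date(releases)]

print(by_string)
print(by_date)
```

With the string comparison, 0.9 lands after 0.10 and 0.10_01, whereas the date ordering follows the order in which the releases were actually made.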

For a couple of months now, CPAN Testers has had its own BACKPAN and CPAN mirrors. Two of the 3 sources are these mirrors, read via Parse::BACKPAN::Packages and Parse::CPAN::Distributions and the 2 index files they use. These can take a long time to parse, and as they don't parse and return any release date for distributions and their versions, using Sort::Versions has been a reasonable alternative. However, there is a third source, and that is the CPAN Uploads that are announced by PAUSE. Due to the time lag of the mirrors, very often a release can be made and not be available on CPAN for several hours, so while no CPAN Testers reports might exist yet, it's still important to know the latest version.

Previously the last source was the only one that contained any release date information, which prompted me to think about recording it for the other sources too. Surprisingly quickly, using the local CPAN Testers copies of BACKPAN and CPAN, I was able to build a basic database of upload data, and tag each entry with 'backpan', 'cpan' or 'upload' to indicate the state the release is currently in. Queries now take fractions of a second instead of several seconds. But, and perhaps more importantly, the sorting of distributions actually makes more sense!
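As a rough idea of what such a database could look like, here is a small sketch using Python's sqlite3 module. The table name, column names, sample author and distribution are all assumptions made for illustration; the actual schema of the CPAN Testers uploads database isn't described here. It shows each release tagged with one of the three states, and the version list for a distribution coming back in release-date order from a single indexed query.

```python
import sqlite3

def build_uploads_db(rows):
    """Create an in-memory uploads table (hypothetical schema) and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE uploads (
            type     TEXT NOT NULL CHECK (type IN ('backpan', 'cpan', 'upload')),
            author   TEXT NOT NULL,
            dist     TEXT NOT NULL,
            version  TEXT NOT NULL,
            released INTEGER NOT NULL  -- release time as epoch seconds
        )
        """
    )
    # An index on (dist, released) keeps per-distribution lookups cheap.
    conn.execute("CREATE INDEX idx_dist_released ON uploads (dist, released)")
    conn.executemany("INSERT INTO uploads VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn

def versions_for(conn, dist):
    """Return (version, type) pairs for a distribution, oldest release first."""
    cur = conn.execute(
        "SELECT version, type FROM uploads WHERE dist = ? ORDER BY released",
        (dist,),
    )
    return cur.fetchall()

def latest_version(conn, dist):
    """Return the most recently released version of a distribution, or None."""
    row = conn.execute(
        "SELECT version FROM uploads WHERE dist = ? ORDER BY released DESC LIMIT 1",
        (dist,),
    ).fetchone()
    return row[0] if row else None

# Made-up sample data, deliberately inserted out of release order.
sample_rows = [
    ("upload", "EXAMPLE", "Foo-Bar", "0.11", 1227052800),
    ("backpan", "EXAMPLE", "Foo-Bar", "0.9", 1168387200),
    ("cpan", "EXAMPLE", "Baz", "1.0", 1200000000),
    ("cpan", "EXAMPLE", "Foo-Bar", "0.10", 1172793600),
]

conn = build_uploads_db(sample_rows)
foo_bar_versions = versions_for(conn, "Foo-Bar")
foo_bar_latest = latest_version(conn, "Foo-Bar")
print(foo_bar_versions)
print(foo_bar_latest)
```

Note that a release still tagged 'upload' (announced by PAUSE but not yet on the mirrors) is picked up as the latest version, which is the case the third source exists to cover.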

The new database is being integrated into the backend code at the moment, but for those that might wish to have this information available for their own uses, the complete database is publicly available at the following locations:

These will now be updated daily, and once everything else is in place will eventually be updated hourly.


404

LaPerla on 2008-11-20T03:46:29

Very nice idea, thanks!

And could you check the URL again, it gives me a 404?

Re:404

barbie on 2008-11-20T08:09:23

Ooops, sorry about that. I'd mistyped the directory name on the server :( Thanks for letting me know :)