The next level of (CPAN) awareness...

Alias on 2009-08-27T11:49:51

The goal of the CPAN Top 100 website/experiment is to drive the CPAN's awareness of itself.

Researchers in artificial intelligence, emergent behaviour and machine consciousness are still trying to grasp the main concepts of, and reasons for, consciousness.

One of my favourite explanations for the evolutionary reasons behind consciousness is that the increase in knowledge, and in judgement based on that knowledge, allows an increase in the efficiency of resource allocation (and a resulting increase in fitness).

As for the skills needed, one of the biggest is being able to sense the passage of time: to be aware not only of aggregate priorities, but also of how those priorities separate into different time scales.

Identifying instantaneous, short-, medium- and long-term priorities allows the more common short-time-period volunteer resources to be allocated to tasks where they can be immediately useful (without these people being distracted by larger long-term goals they are unable to help with).

This is something that I'm afraid I can't do at all yet, but I'm searching for ways to gain this important skill.

As evidence of my failing, and the reason it is important, take the current example of Test::Class.

This is a very common, very important test module. 73 other modules depend on it. Not Top 100 material, but still a big deal.

Currently, the FAIL 100 algorithm identifies it as the 35th highest priority module for attention based on the level of failing tests and the number of things that depend on it.

However, one look at the CPAN Testers page for Test-Class tells an entirely different story. 35th is not the kind of ranking that a human would place on it. A human would notice the trend in the failures, the suddenness with which they started, and the likelihood of the problem causing damage in the future.

http://www.cpantesters.org/distro/T/Test-Class.html#Test-Class-0.31

Clearly the current aggregate analysis techniques are just not good enough.
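
To make the gap concrete, here is a rough sketch (in Python, purely for illustration) comparing an all-time aggregate score with one that looks at the recent trend. Neither function is the real FAIL 100 algorithm: the formulas, the `window` parameter and the two sample histories are all assumptions invented for this example. Only the figure of 73 dependents comes from the Test::Class case above.

```python
# Hypothetical sketch: neither function is the real FAIL 100 algorithm.

def aggregate_score(results, dependents):
    """All-time failure rate weighted by the number of dependent modules."""
    if not results:
        return 0.0
    fail_rate = results.count("FAIL") / len(results)
    return fail_rate * dependents


def trend_score(results, dependents, window=20):
    """Recent failure rate, boosted by how sharply it has risen.

    `results` is a list of "PASS"/"FAIL" strings, oldest first.
    `window` (an assumed tuning knob) is how many recent reports
    count as "now".
    """
    recent = results[-window:]
    older = results[:-window]
    if not recent:
        return 0.0
    recent_rate = recent.count("FAIL") / len(recent)
    older_rate = older.count("FAIL") / len(older) if older else recent_rate
    # A sudden jump from a healthy baseline is amplified; a module that
    # has always failed at the same rate gets no extra boost.
    jump = max(0.0, recent_rate - older_rate)
    return recent_rate * (1.0 + jump) * dependents


# Invented data: a long healthy history, then a sudden run of failures.
sudden = ["PASS"] * 180 + ["FAIL"] * 20
# Invented data: the same number of failures, spread evenly over time.
chronic = (["PASS"] * 9 + ["FAIL"]) * 20

if __name__ == "__main__":
    for name, history in (("sudden", sudden), ("chronic", chronic)):
        print(name, aggregate_score(history, 73), trend_score(history, 73))
```

Both histories contain the same number of failures, so the aggregate score cannot tell them apart, while the trend-aware score ranks the sudden breakage far higher. The particular boost used here is only one of many possible choices.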

The experiment continues...

A classic example of one of CPAN's major failings

tgape on 2009-08-28T01:51:24

Just in case I'm misinterpreting that output, it looks to me very much like Test::Class has a dependency which has upgraded to a broken version.

This reminds me of two things I'd really like for CPAN to have:

1. Ability to blacklist specific module versions, both in requirements (my module requires File::Next >= 1.0 and not 1.04, for example) and overall (effectively, we pull this version from the repository.)

2. An option to install the test scripts for each module some place, and a script to re-run those test scripts when considering installing a module in the dependency graph for the associated module. That is, if I have App::Ack installed, along with its tests, and I attempt to install File::Next 1.04, while it passes its own tests, it fails App::Ack's tests, and so it's not blindly installed, and therefore my system isn't broken.

I do realize that the latter feature would require a lot of work and additional features - for example, when processing an install queue, build a single blib directory for all of the modules in the queue, so that the tests can run in the context of all of the upgraded software, instead of just what we have processed thus far. (That would also be useful for handling circular dependencies - that problem goes away, so long as they aren't build-depends.) But there's a lot of things that take a lot of work but are really needed.
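
A minimal sketch of the gate described in idea 2, written in Python only for illustration. It assumes something no current CPAN client provides: that each installed module's tests were saved and can be re-run against a candidate upgrade. The function name `safe_to_install`, the shape of `installed_tests`, the stand-in test function and the version 1.03 are all hypothetical.

```python
def safe_to_install(candidate, installed_tests):
    """Return (ok, failures) for a candidate upgrade.

    `candidate` is a (name, version) pair. `installed_tests` maps the
    name of each already-installed dependent module to a callable that
    re-runs its saved test suite against the candidate and returns True
    on success. Both shapes are assumptions made for this sketch.
    """
    failures = [
        dependent
        for dependent, run_tests in sorted(installed_tests.items())
        if not run_tests(candidate)
    ]
    return (not failures, failures)


# Stand-in for App::Ack's saved tests: pretend they break on File::Next 1.04.
def app_ack_tests(candidate):
    return candidate != ("File::Next", "1.04")


saved = {"App::Ack": app_ack_tests}

if __name__ == "__main__":
    print(safe_to_install(("File::Next", "1.03"), saved))
    print(safe_to_install(("File::Next", "1.04"), saved))
```

The installer would refuse (or at least warn) whenever the list of failures is non-empty, even though the candidate passed its own tests.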

Re:A classic example of one of CPAN's major failin

Alias on 2009-08-28T04:16:59

> 1. Ability to blacklist specific module versions, both in requirements
> (my module requires File::Next >= 1.0 and not 1.04, for example) and
> overall (effectively, we pull this version from the repository.)
This one is certainly possible for non-configure_requires dependencies, because we already have a Turing-complete configuration phase. In pseudo code...
if ( installed File::Next is 1.04 ) {
    requires File::Next 1.05
} else {
    requires File::Next 1.0
}
Idea 2 is trickier, because it may not necessarily be the best way to achieve what you want. Also, the dependency graph information for your module is discarded at install time. So even your implementation depends on dependency meta-data being installed, and on a way to reverse-resolve it.
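
To illustrate what reverse-resolving would involve, here is a small sketch in Python (chosen only for illustration). It assumes the installer had kept each module's dependency list at install time, which is exactly the piece that is missing today. The `installed_meta` structure, both function names and the module `My::Tool` are hypothetical.

```python
from collections import defaultdict, deque


def build_reverse_index(installed_meta):
    """Map each module to the set of installed modules that require it.

    `installed_meta` is {module: [dependencies]}, standing in for
    meta-data that would have to be saved at install time.
    """
    reverse = defaultdict(set)
    for module, deps in installed_meta.items():
        for dep in deps:
            reverse[dep].add(module)
    return reverse


def affected_by(module, reverse):
    """Installed modules that depend on `module`, directly or indirectly."""
    seen = {module}
    queue = deque([module])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    # The module itself is not its own dependent, even in a cycle.
    return seen - {module}


# Hypothetical saved meta-data; My::Tool is an invented module.
installed_meta = {
    "App::Ack": ["File::Next"],
    "My::Tool": ["App::Ack"],
}

if __name__ == "__main__":
    reverse = build_reverse_index(installed_meta)
    print(sorted(affected_by("File::Next", reverse)))
```

With an index like this, an upgrade of File::Next could be traced forward to every installed module whose tests would need re-running.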