Thanks for everybody's input on my previous post.
In addition to the comments, I received another very interesting data point from an email exchange with Tom Hughes, the IO::Zlib maintainer (IO::Zlib currently holds #1 on the FAIL 100 list).
> It's showing up on my graph-weighted FAIL tracker as the number one source of problems at the moment.

This email response is great, because it demonstrates an important factor in maintainership.
Well if nobody tells me about these things I can't possibly do anything about them...
This seems like a side issue to the FAIL 100 notification. CPAN Testers already provides many ways to track distribution test failures, including e-mail digests and RSS feeds.
In the IO::Zlib case, there are:
It seems that more CPAN authors should be made aware of the testing resources available to them. I'm not sure they need to be educated on a weekly basis, however.
Re:Sorry I'm late....
Alias on 2009-07-15T01:01:55
Despite all this information, the author still felt like he wasn't told.
To avoid being horribly spammy, none of our current CPAN-wide mechanisms really takes the initiative to reach out to the author.
And if they do, they don't really convey the level of urgency of the problem.
What I'm hoping we can do in this case is to provide an extremely low-volume mechanism that you as a normal CPAN author will never see. For example, POE has never appeared on the FAIL 100 list since I started tracking it.
But if, at some point in the future, something goes horribly wrong with a release (and by some stroke of weird chance nobody on the POE team notices) you will be poked by an email of a type you have never seen before and are thus more likely to pay attention to.
This email should not be there just to annoy you; it should point you towards resources like the ones you describe above. If you have the absolute top module out of 17,000 causing trouble for everyone else, I think a weekly email might be appropriate.
I'm already noticing that some of the toolchain modules showing up in the Top 10 have failures that are obscure or wrong, so hopefully, in the process of setting up a communications environment that pushes to the author, we can also set up extra priority resources.
For example, we could get commitments from the CPAN Testers to prioritise testing around this #1 position module, to provide testing VMs to them, and to have an existing set of volunteers in a dedicated IRC channel willing to take over or help hack on the #1 module specifically.
So yes, while there are resources available, I feel that these resources don't provide any prioritisation that directs more help to the things that are the biggest problems.
Re:Sorry I’m late….
Aristotle on 2009-07-15T01:27:25
> For example, we could get commitments from the CPAN Testers to prioritise testing around this #1 position module
Do be careful to avoid self-reinforcing bias when you do this.
Re:Sorry I’m late….
Alias on 2009-07-15T03:17:30
I actually meant more in terms of having the CPAN Testers look at new releases faster, or having them commit to providing better levels of direct access to their hosts.
The self-reinforcing bias is interesting though, because in a sense it may be a positive thing.
If anything that makes it to the #1 position is then subject to even more intense examination that boosts its score higher, this not only provides more data, it also helps expose more edge cases in what might be a quite edgy module anyway.
So once you clean up the module for the next release, you stand a better chance of not reappearing on the list in the future, compared to a situation in which you fix one bug and release just to reset the counter on CPAN Testers, only to slowly drift to the top of the list again.
Remember, the goal of the FAIL 100 list is not to judge modules as being inherently good or bad; it's to identify the places in which we get the maximum benefit for our maintenance time.
Now if this were a judgement call, a way of placing inherent value on the modules (such as the Kwalitee metrics), then I think this bias would be a bigger risk.
Re:Sorry I’m late….
Aristotle on 2009-07-15T04:04:37
I was too terse, sorry.
What I meant by self-reinforcement referred to the ranking, not the extra scrutiny. That is, the extra scrutiny from being at #1 is good – but be careful about whether/how you weigh that extra scrutiny in the next ranking recalculation. Otherwise, just making #1 may increase the chances of staying there for no other reason than the extra scrutiny (which is only due to making #1) causing extra FAILs that would not have turned up otherwise – fortifying the position against modules that should by substantial merits take #1.
It’s the classic popularity contest feedback cycle.
Re:Sorry I’m late….
Alias on 2009-07-15T07:21:07
I see.
In this case that isn't a problem, because of the way that the ranking is calculated.
Only the most recent production release is counted, so as soon as #1 does a production release they get their score reset.
Also, I've noticed that the rankings follow a power law anyway, so even if it does undergo some extra scrutiny that's ok, because it took a hell of a lot for it to get that high in the first place.
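The reset behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual FAIL 100 code: the record shapes, the distribution names `Foo` and `Bar`, and the weighting (each FAIL counts once plus once per downstream dependent) are all assumptions made for the example. The only rule taken from the thread is that FAILs are counted against the most recent production release alone.

```python
from collections import namedtuple

# Hypothetical record shapes; the real FAIL 100 data model is not described here.
Release = namedtuple("Release", "dist version uploaded is_production")
Report = namedtuple("Report", "dist version grade")


def latest_production(releases):
    """Map each distribution to its most recently uploaded production release."""
    latest = {}
    for rel in releases:
        if not rel.is_production:
            continue  # developer releases never count towards the score
        current = latest.get(rel.dist)
        if current is None or rel.uploaded > current.uploaded:
            latest[rel.dist] = rel
    return latest


def fail_scores(releases, reports, dependents):
    """Score each dist by FAILs on its latest production release.

    Assumed weighting: every FAIL is worth 1 + number of dependents.
    """
    latest = latest_production(releases)
    scores = {dist: 0 for dist in latest}
    for rep in reports:
        rel = latest.get(rep.dist)
        if rel is not None and rep.version == rel.version and rep.grade == "FAIL":
            scores[rep.dist] += 1 + dependents.get(rep.dist, 0)
    return scores


def ranking(scores):
    """Order dists by descending score, breaking ties by name."""
    return sorted(scores, key=lambda dist: (-scores[dist], dist))


# Made-up data: Foo has many dependents and three FAILs on version 1.0.
releases = [Release("Foo", "1.0", 1, True), Release("Bar", "2.0", 2, True)]
reports = [Report("Foo", "1.0", "FAIL")] * 3 + [Report("Bar", "2.0", "FAIL")]
dependents = {"Foo": 9, "Bar": 0}

before = fail_scores(releases, reports, dependents)

# Foo ships a new production release; FAILs against 1.0 no longer count.
releases.append(Release("Foo", "1.1", 3, True))
after = fail_scores(releases, reports, dependents)
```

In this sketch `Foo` leads the ranking before its new release and drops to a score of zero immediately afterwards, so any extra scrutiny it attracted while at #1 does not carry over.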