CPANTS: author highscores

domm on 2005-03-25T10:17:12

Yesterday evening I found a spare hour and fiddled a bit with the CPANTS author highscores.

There are now two pages, one listing all authors with more than 5 dists on CPAN and one listing those with 5 or fewer, each sorted by average Kwalitee. I also did proper 'ex aequo' listings (e.g., I'm currently sharing place 25 with BooK and Joshua Hoblitt).
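For readers curious how such a listing could be built, here is a minimal Python sketch (not the actual CPANTS code). It assumes the input is a list of hypothetical (author, kwalitee) pairs, one per dist, and that tied authors share a place with the following place skipped, as in sports rankings. The names split_and_rank and rank_with_ties are made up for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def rank_with_ties(averages):
    """Sort (author, average) pairs by average, best first; ties share a place."""
    ordered = sorted(averages, key=lambda pair: (-pair[1], pair[0]))
    ranked = []
    for position, (author, avg) in enumerate(ordered, start=1):
        if ranked and ranked[-1][2] == avg:
            place = ranked[-1][0]   # 'ex aequo': reuse the previous place
        else:
            place = position        # assumption: places after a tie are skipped
        ranked.append((place, author, avg))
    return ranked


def split_and_rank(dists, threshold=5):
    """dists: iterable of (author, kwalitee) pairs, one pair per dist."""
    per_author = defaultdict(list)
    for author, kwalitee in dists:
        per_author[author].append(kwalitee)

    many, few = [], []
    for author, scores in per_author.items():
        # Fraction keeps averages exact, so equal averages compare as equal.
        avg = Fraction(sum(scores), len(scores))
        (many if len(scores) > threshold else few).append((author, avg))
    return rank_with_ties(many), rank_with_ties(few)
```

Calling split_and_rank(dists) returns two lists of (place, author, average) tuples, one for each page.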

Next up: Add up/down movement information to those lists, as in pop charts. As soon as I'm done with this, I'll set up a cron-job to do weekly reports. And after that: more metrics!
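The up/down movement could be derived by comparing the places from two consecutive runs. A minimal sketch, assuming each run is available as a hypothetical dict mapping author to place (the movement function and its labels are invented for illustration, not taken from CPANTS):

```python
def movement(previous, current):
    """Compare two runs (dicts of author -> place) and label each author's move."""
    result = {}
    for author, place in current.items():
        old = previous.get(author)
        if old is None:
            result[author] = "new"                    # not listed in the previous run
        elif old > place:
            result[author] = f"up {old - place}"      # a smaller place number is better
        elif old < place:
            result[author] = f"down {place - old}"
        else:
            result[author] = "same"
    return result
```

Authors who dropped off the list entirely are not reported here; that would need a second pass over the previous run.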


Authors pages?

Alias on 2005-03-25T12:02:15

Any chance of generating per-author module lists (for the +5 authors at least) so that we can tell which ones are letting us down?

Also, with PPI 0.903 now totally leak-safe, feel free to go nuts using it to generate new metrics :)

Re:Authors pages?

domm on 2005-03-25T13:30:44

Yes, that's planned. I'll probably get to it next week.

Damn

jk2addict on 2005-03-25T15:43:09

Your latest run missed my 6th module upload by "this much". :-)

At first glance, it looks like the numbers for people in the "top" lists are getting better.

metrics

cog on 2005-03-25T16:17:29

Where are the metrics, anyway? O:-)

I'd like to go through my modules and check out why I missed the first place :-P

Re:metrics

cog on 2005-03-25T16:19:11

Where are the metrics, anyway? O:-)

Disregard that! Next time I'll look at the top of the page too O:-)

Time for run?

ambs on 2005-03-25T16:48:26

What's the time needed for a full run on CPAN? And, how much CPU/Disk does it cost?

I'm asking this because it would be interesting to have the Kwalitee run every day... every week... depending on the time needed for a full run. Also, I can offer some CPU/Disk for this project.

Re:Time for run?

domm on 2005-03-29T10:27:12

The last run took:

real    173m22.169s
user    111m49.530s
sys     19m47.620s

That's a bit less than three hours on an 800 MHz AMD Duron (it's an old dev server).

There is an option to only check new dists (unless there's a new metric), but it has a bug (autoincremented IDs get mixed up with old ones).

My plan is to finish the 'highscore' list and add per-author info. Then set up weekly runs. Then add new metrics.

Daily runs would definitely be possible, but it would be hard to track improvements. Hmm. Maybe I should have two databases: a 'snapshot' of kwalitee at a given date, and a long-running DB tracking kwalitee improvements (but not in much detail).
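One way the snapshot/history split might look, as a rough sketch only: it uses Python's sqlite3 module and, for brevity, two tables in a single database instead of two separate databases. The table names, columns and the record_run function are all assumptions, not the real CPANTS schema.

```python
import sqlite3


def record_run(conn, run_date, averages):
    """Store one run: overwrite the snapshot, append to the history.

    averages: dict of author -> average kwalitee (hypothetical input shape).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshot "
        "(author TEXT PRIMARY KEY, avg_kwalitee REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history "
        "(run_date TEXT, author TEXT, avg_kwalitee REAL, "
        "PRIMARY KEY (run_date, author))"
    )
    # The snapshot only ever holds the latest run.
    conn.execute("DELETE FROM snapshot")
    conn.executemany("INSERT INTO snapshot VALUES (?, ?)", list(averages.items()))
    # The history keeps one coarse row per author and run date.
    conn.executemany(
        "INSERT OR REPLACE INTO history VALUES (?, ?, ?)",
        [(run_date, author, avg) for author, avg in averages.items()],
    )
    conn.commit()
```

With this layout, daily runs keep the snapshot current while improvements over time can be read from the history table.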