CPANSTATS has turned out pretty interesting and the results are pretty cool. However, the number of systems running it has stabilised at eighty. This is obviously because only people who read my journal have installed it. I mean, who would install a script they downloaded off the Internet? ;-)
What we need now is to achieve critical mass: get everyone else to take part in the project. The simple answer would be to integrate it into CPANPLUS. A great many people use CPANPLUS (and more will in the future if it gets into the Perl core). CPANPLUS tells you when it's out of date, and the CPANSTATS results for it show that quite a few people are running the latest released version. Also, it removes a lot of my code by using the wonderful CPANPLUS::Backend. I'm always for deleting code...
It'll be new and exciting, but I haven't quite decided what it should do. Currently, I report stats for every module (e.g. CGI::Fast). Would it be simpler if I reported stats for each distribution (e.g. CGI in this case) instead? If it's built into CPANPLUS, should it report stats automatically whenever you run it? Every week? Only if you explicitly tell it to? (By default, of course, it will be disabled.) Does PAUSE contain historical data so I could show the release dates? Could it use collaborative filtering to suggest modules that you might like?
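To make the module-versus-distribution question concrete, here is a rough sketch of how per-module reports might be rolled up into per-distribution counts. It is not the CPANSTATS code: the function names, the hard-coded mapping and the sample reports are all made up for illustration, and a real client would take the module-to-distribution mapping from the CPAN index. It is written in Python only to keep the example short and self-contained.

```python
from collections import Counter

# Hypothetical module -> distribution mapping (illustration only).
# A real client would derive this from the CPAN index.
MODULE_TO_DIST = {
    "CGI": "CGI",
    "CGI::Fast": "CGI",
    "CGI::Cookie": "CGI",
    "LWP::UserAgent": "libwww-perl",
}


def distributions_for(installed_modules, module_to_dist=MODULE_TO_DIST):
    """Collapse one system's module list into the set of distributions it implies.

    Assumption: a module missing from the mapping is treated as its own
    distribution, so nothing is silently dropped.
    """
    return {module_to_dist.get(module, module) for module in installed_modules}


def count_systems_per_distribution(reports, module_to_dist=MODULE_TO_DIST):
    """Count how many reporting systems have each distribution installed.

    `reports` is an iterable of module-name lists, one list per system.
    A system counts once per distribution, however many of that
    distribution's modules it reports.
    """
    counts = Counter()
    for modules in reports:
        counts.update(distributions_for(modules, module_to_dist))
    return dict(counts)


if __name__ == "__main__":
    sample_reports = [
        ["CGI", "CGI::Fast"],             # made-up system 1
        ["CGI::Cookie", "LWP::UserAgent"],  # made-up system 2
    ]
    print(count_systems_per_distribution(sample_reports))
```

Counting each system once per distribution means a distribution with dozens of small modules does not swamp the results.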
My brain is murky. The project could do a lot more, but at the moment I'm not quite sure what. Do you guys have any suggestions?
I'd definitely find stats for distributions simpler, as some of them have many modules that are barely significant. I think it's the best level of granularity. Reporting whenever it's run or every week would seem fine to me, and while I agree that it ought to be off by default, I think it should strongly recommend turning it on (because it's fun!). Collaborative filtering would simply rock, if only for the gadget value. It could also give you the names of the top five people on CPAN you most want to buy a beer for.
Do you guys have any suggestions?
I guess I'm in a minority, but I liked the functionality from the patch I sent you, which lets the cpanstats script (with its heavy dependency demands) run on one perl but report the modules of another perl. Will this be lost with the integration into CPANPLUS, or will the CPANPLUS version still be able to probe another perl?
Re:probing other perl versions
acme on 2003-02-18T10:03:37
Well, the remote API will be pretty simple so it should be easy to code up a non-CPANPLUS script. The vast majority of results should come with CPANPLUS, however.
I work behind a fairly restrictive firewall, which also demands a username/password. For some weird reason, the env_proxy parameter that you pass into the code doesn't pick up the username/password environment variables (so the submission fails).
However, PPM does work (just as a point of comparison, it uses HTTP as well, right?). Perhaps a way of sending statistics offline (do a -dryrun and mail the tar.gz somewhere?) might work for the few people who don't have access to the net all the time (and it's through a dog-slow link when they do). The number of submissions is now at 99, btw.
Re:smallish bug report
acme on 2003-03-07T08:11:15
Thanks, I'm keeping this in mind for the next CPANSTATS project. I'll be creating lots more stats, and there will be options to send it to the server using a variety of methods.